Abstract

Intuitively, if we can prove that a program terminates, we expect some conclusion re- 
garding its complexity. But the passage from termination proofs to complexity bounds is 
not always clear. In this work we consider Monotonicity Constraint Transition Systems, a 
program abstraction where termination is decidable (based on the size-change termination 
principle). We show that these programs also have a decidable complexity property: one 
can determine whether the length of all transition sequences can be bounded in terms of the 
initial state. This is the bounded termination problem. Interestingly, if a bound exists, it 
must be polynomial. We prove that the bounded termination problem is PSPACE-complete
and, moreover, if a bound exists then a symbolic bound which is constant-factor tight (in the 
univariate case) can be computed in PSPACE. We present this computation in the form of 
computing a reachability bound, a bound on the number of visits to a given program location. 
This presentation is inspired by the practical usefulness of this problem formulation. 

We also discuss, theoretically, the use of bounds on the abstract program to infer conclusions
on a concrete program that has been abstracted. The conclusion may be a polynomial
time bound, or in other cases polynomial space or exponential time. We argue that the 
monotonicity-constraint abstraction promises to be useful for practical complexity analysis 
of programs. 

1 Introduction 

On Complexity Analysis of programs. Automatically inferring complexity properties of 
computer programs is a well-established subfield of static analysis (the related work section will 
provide bibliographic references). The topic has received renewed attention from static analysis
researchers in recent years, sometimes under the names cost analysis, bound analysis or growth-rate analysis.
The overall goal is to develop algorithms that can process a subject program and answer
questions about its complexity, where complexity may refer to various measures of resource usage
such as running time, memory usage, stack usage, etc. 

It is well-known that in the analysis of algorithms, questions about precise running time 
(in physical units) are usually abandoned, since studying this measure involves many properties 
of complex hardware systems as well as the software platform, which shift the focus from the 
algorithm itself. In program analysis, one can distinguish works that concentrate on the real- 
time dimension (often going by the keyword WCET — worst-case execution time analysis), and 
works that concentrate on more robust (and abstract) program-based measures such as number 
of instructions executed or just the number of iterations of a loop. Naturally, some works involve
both, to varying degrees; however, our work only addresses the program-based analysis.

A typical question that a program analyzer for complexity may be asked to answer is: give 
an expression of the cost (say, execution time — which we shall understand as the number of 
program steps) in terms of (some designated) input values. 

Since we are not measuring real time anyway, it seems reasonable, as in algorithm textbooks, 
to neglect input-independent constants and use the O-notation. This simplifies the problem, but 
does not change the basic challenge. Even if we only ask for a complexity class, for example to 
separate polynomial-time programs from super-polynomial ones, this problem is still undecidable 
in every Turing-complete programming language. This means that there is no hope to solve the 
problem! How can an algorithm designer overcome such an obstacle? We list a few alternative 
approaches (in the context of complexity analysis). 

• Focus on specially-designed languages. Such works often grew out of the research on 
Implicit Computational Complexity (ICC). In fact, a typical result in this field is the proof 
that a complexity class is precisely captured by a particular sub-recursive (Turing incomplete) 
programming language. But these languages force the user to program in a particular way, 
often too unnatural. Other works show that for suitably restricted languages, the complexity 
classification is not predetermined but is decidable. This is an advantage, as it means that the 
language is less restricted and a more natural programming style should be possible. 

• Give up a complete solution to the problem. This is actually the common approach in
the field of static analysis, since research in this field often takes the programming language
as given. One then produces analyses that can have "false negatives" or "false positives"; in
complexity analysis, the most common goal is to provide an upper bound, thus the question "is
the program polynomial?" will occasionally be answered by a false negative, resulting from an
overestimated upper bound.

• A third approach — perhaps a middle road — may be described as abstract and conquer. 
The idea is to first translate a program from its original language into an abstract form, and 
then analyze the abstract form; a useful abstraction captures important aspects of the source 
program, but it is in the nature of abstraction to lose some precision. One may hope, then, 
that for abstract programs one really can solve the problem of interest. This may require the 
development of a good definition of the analysis goals in the abstract world. This approach 
can already be seen in different fields of program analysis, including complexity analysis, as 
we will mention in more detail below. It has several benefits, in particular, theoretically, the 
abstract program model may be sufficiently simple to develop a firm theoretical understanding; 
as problems may be decidable, one may be able to progress to proving their computational
complexity. Practically, the approach suggests a separation of concerns among a front end and 
a back end, and promotes modularity in tool construction. 

Termination Analysis. Termination Analysis is another much-studied topic in program anal- 
ysis. Intuitively, a termination proof seems likely to reveal something about the complexity of the 
subject program, since proving termination means proving that the complexity is bounded. It
is, therefore, natural to try to extend work on termination proofs to obtain complexity bounds.
In fact, some works on complexity analysis have already exploited techniques from termination
analysis (polynomial interpretation of terms in [BCMT01, CL92]; ranking functions in
[AAGP08, ADFG10a]). In this work too, our goal was to examine certain theoretical and
algorithmic results from termination analysis and evolve them to obtain results in complexity
analysis. Specifically, we study the monotonicity constraint abstraction.

Constraint Transition Systems. A Constraint Transition System (CTS) is an abstract 
program which is based on viewing the semantics of the program as an infinite-state transition 
system which has a finite description. The components of this description are: first, a control 
flow graph (CFG), which is a finite directed graph; we refer to its nodes as flow points. Typically, 
they represent concrete locations in the source code of the subject program. Second, a finite set 
of variables associated with every flow point; a state is specified by $(\ell, x_1, \dots, x_n)$ where $\ell$ is a
flow point and $x_1, \dots, x_n$ the values of the variables. The variables may represent actual program
data, abstractions like the size of an object (a list, a tree, a set etc.), in some cases program 
constants, and in some cases "invented" variables (created by the analysis tool). Finally, every 
arc of the CFG, to which we refer as an abstract transition, is associated with a formula that 
represents a relation on source states and target states (the transition relation). We refer to 
this formula as a constraint. A common notation for constraints is to denote the target state 
variables by primed identifiers. So, for example, x > x' means that the new value of variable
x is smaller than the old one. Figure 1 shows a small program and a possible abstraction to
constraints (in fact, to monotonicity constraints, as defined below). The reader should be able
to see that the constraints suffice for deducing that the loop always terminates. Additional
examples appear in later sections. 
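
As a concrete illustration of the primed-variable notation, the following Python sketch checks that each concrete step of the program in Figure 1 satisfies the constraint attached to its loop arc. The encoding of a constraint as a list of (left, relation, right) triples and the helper names `satisfies`, `step` and `run` are assumptions made for this illustration only.

```python
import operator

# Order relations allowed in a monotonicity constraint.
RELS = {">": operator.gt, ">=": operator.ge, "=": operator.eq}

# Constraint on the loop arc of Figure 1; a trailing ' marks a target-state variable.
FIG1 = [("x", ">", "z"), ("y", "=", "x'"), ("x", ">", "y'"), ("z", "=", "z'")]

def satisfies(constraint, source, target):
    """Return True if the pair (source, target) of states satisfies every conjunct."""
    env = dict(source)
    env.update({name + "'": value for name, value in target.items()})
    return all(RELS[rel](env[a], env[b]) for a, rel, b in constraint)

def step(state):
    """One iteration of the loop: while x > z do (x, y) := (y, x-1)."""
    if state["x"] > state["z"]:
        return {"x": state["y"], "y": state["x"] - 1, "z": state["z"]}
    return None  # loop guard fails: the program halts

def run(state):
    """Collect the sequence of states visited from an initial state."""
    trace = [state]
    while (nxt := step(trace[-1])) is not None:
        trace.append(nxt)
    return trace

trace = run({"x": 5, "y": 3, "z": 0})
# Every concrete transition is covered by the abstract one.
assert all(satisfies(FIG1, s, t) for s, t in zip(trace, trace[1:]))
```

The abstraction is sound in the sense used here: whatever the concrete loop does in one iteration is also permitted by the constraint, while the constraint may permit additional behaviours.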

So far, the definition has been very general, and practically any program representation or 
computational model of finite description can be represented in this way. However, certain kinds 
of CTS are more frequent in program analysis. To specify a particular kind of CTS, we have to 
specify the kind of constraints allowed and over what carrier set they work. In this paper, we 
employ the notation (C, D)-CTS for a CTS that applies constraints of type C to the domain D.

Monotonicity constraints were introduced to termination analysis as early as 1991 [Sag91]. 
These are constraints that only use the order relations > and ≥, and their use in termination
analysis stems from the idea of proving termination by identifying a descending sequence, a
pattern typical of logic and functional programming, where one often recurses on values such
as terms, trees or lists while shrinking them. Hence size-change termination, a name given to
this approach in [LJBA01]. The precise abstraction used in the latter work is this: constraints
are conjunctions of relations of the form x > y' or x ≥ y'. They are referred to as size-change
graphs (SCG). Thus, the abstraction employed by size-change termination (à la [LJBA01]) may
be expressed as (SCG, Ord)-CTS, where Ord stands for "any well-ordered set."

When one looks at earlier papers using monotonicity constraints (e.g., [Sag91, LS97]), one 
may notice that their constraint formulae are not restricted to size-change graphs — there was 
no prohibition of constraints such as x < x' (an increase, rather than a decrease), or x < y (a
constraint on source-state variables), or x' < y'. We refer to this constraint domain as MC. It
also is clear that the intended domain is the non-negative integers. In 2005, Codish, Lagoon 
and Stuckey [CLS05] began the extension of size-change termination theory to monotonicity 
constraints and the integers. To illustrate the need for refining the theory, note that a loop 
described by the constraint x < x' ∧ x < y ∧ y = y', a common pattern in imperative programs,
does not satisfy size-change termination (there are well-ordered sets in which this can be repeated 
forever), but terminates over the integers. Note also that when arguing for its termination over 
the integers, the assumption x, y ≥ 0 is redundant, and in fact in imperative programs the
important variables for loop control are often of integer type, and can be (by design or by
mistake) negative too. Note also the usage of a constraint which is not of the "size-change
graph" type. This motivated the study of (MC, Z)-CTS in [BA11]. Two significant results
of this study are: (1) termination of (MC, Z)-CTS is decidable; it is PSPACE-complete. (2)
There is an algorithm for constructing global ranking functions for terminating (MC, Z)-CTS
instances.

Program 1:

    while x > z do
        (x, y) := (y, x-1)

CFG and constraints:

    x > z ∧ y = x' ∧ x > y' ∧ z = z'

Figure 1: CTS abstraction of a simple program (1).
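
To make the integer-domain argument concrete, the loop x < x' ∧ x < y ∧ y = y' discussed above can be simulated directly. The sketch below is only an illustration under stated assumptions: the function name `run_length` and the two ways of resolving the non-deterministic increase are hypothetical. It checks that the number of transitions never exceeds max(0, y - x), for negative initial values as well.

```python
import random

def run_length(x, y, choose_increase):
    """Count transitions of the abstract loop x < x' ∧ x < y ∧ y = y' over the integers."""
    steps = 0
    while x < y:                    # source-state guard x < y
        x += choose_increase()      # any integer x' > x is allowed, so the increase is >= 1
        steps += 1                  # y is unchanged (y = y')
    return steps

slowest = lambda: 1                      # the resolution that maximises the run length
erratic = lambda: random.randint(1, 5)   # some other resolution of the non-determinism

for x0, y0 in [(-5, 3), (0, 10), (7, 2)]:
    bound = max(0, y0 - x0)
    assert run_length(x0, y0, slowest) == bound
    assert run_length(x0, y0, erratic) <= bound
```

The bound depends on the difference of two variables rather than on either value alone, which is why non-negativity of x and y plays no role.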

Other types of CTS have also appeared in termination analysis as well as in complexity analysis;
more on this below. 

Complexity Analysis of Abstract Programs. Stated succinctly, a CTS represents a tran- 
sition relation (relation on the set of states) and the goal of termination analysis is to prove 
that this relation is well-founded. A natural notion of complexity for the abstract program is 
the (worst-case) number of transitions starting from an initial state (a state where the program 
is at its designated point of entry), which we would like to bound in terms of the variables at 
that initial state (or a few designated variables). 

Our research on complexity analysis of (MC, Z)-CTS has been inspired by two earlier works
on the complexity analysis of programs, which are both based on a CTS abstraction: the COSTA
system of Albert et al. [AAGP08, AAGP10], which targets Java bytecode programs, and the
WTC analyzer of Alias et al. [ADFG10a], targeting C programs. For the purpose of this
presentation, we follow the latter (more on the former in Section 2). The abstraction used is
(Aff, Z)-CTS, where Aff denotes a constraint language in which a constraint is a conjunction of
linear (affine) inequalities, for example: x < 1 ∧ x + y < z. It should be clear that (MC, Z)-CTS
is a sub-model of (Aff, Z)-CTS. As for analysis of the abstract program, the method is to search
for a lexicographic linear ranking function. Roughly speaking, this is a function of the form 
$\rho(\ell, x_1, \dots, x_n) = (f_{\ell,1}(\vec{x}), \dots, f_{\ell,d}(\vec{x}))$ where each $f_{\ell,i}$ is an affine function on $\mathbb{Z}^n$ whose values
in reachable program states $(\ell, \vec{x})$ are guaranteed to be non-negative. Moreover, the value of
this function decreases (lexicographically) in every transition. It is easy enough to see that 
this proves termination; it also permits one to bound the running time. The bound will be a 
polynomial of degree d (the length of the longest tuple used, also referred to as the dimension). 
Interestingly, among all functions that satisfy the conditions which [ADFG10a] poses in the
search for ranking functions, the algorithm provably finds one of smallest dimension. 
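
The way a lexicographic ranking function of dimension d yields a polynomial bound of degree d can be seen on a small example. The sketch below uses a hypothetical two-counter loop (not taken from the cited works) and a hand-written candidate ranking function; it checks along an execution that the rank stays non-negative and strictly decreases, and that the number of steps stays within the resulting quadratic bound. All names are assumptions for illustration.

```python
def step(state):
    """A hypothetical two-counter loop: count j down, then reset it and decrement i."""
    i, j, n = state
    if j > 0:
        return (i, j - 1, n)
    if i > 0:
        return (i - 1, n, n)
    return None  # both counters exhausted

def rank(state):
    """Candidate lexicographic ranking function of dimension d = 2."""
    i, j, _ = state
    return (i, j)

def count_steps(n):
    state, steps = (n, n, n), 0
    while (nxt := step(state)) is not None:
        # Python compares tuples lexicographically, which is the order needed here.
        assert rank(nxt) < rank(state)
        assert all(component >= 0 for component in rank(nxt))
        state, steps = nxt, steps + 1
    return steps

for n in range(6):
    # The rank takes values in {0..n} x {0..n}, so at most (n+1)^2 - 1 strict decreases fit.
    assert count_steps(n) <= (n + 1) ** 2 - 1
```

Each component of the rank ranges over at most n+1 values, so the count of possible strict decreases is a product of d such ranges; this is the source of the degree-d polynomial.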

Both of the above works were accompanied by front-ends that abstracted programs, demon- 
strating the applicability of the approach to analysis of concrete programs in the respective 
languages. 

The (MC, Z)-CTS abstraction has, previous to our work, only been used for proving
termination.¹ Thus, our first contribution is to define the property of bounded termination in this
particular context. This may seem a trivial step, but introducing this definition was important 
as it expressed our realization that not for every terminating CTS can a complexity bound be 
obtained (this will be shown precisely in Section 3). Hence, the class of bounded-terminating 
instances is a subset of the terminating ones. Now we can ask about the decidability and com- 
plexity of this set. Our fundamental result is a proof that bounded termination is decidable. 
Moreover, we prove that it is PSPACE-complete. This is the same complexity as for termination,
and indeed we re-use some techniques from the work on (MC, Z)-CTS termination in both
the upper bound proof and the hardness proof. Unlike [AAGP08, ADFG10a, ZGSV11], we do
not use ranking functions.

An interesting consequence of our proofs was the discovery that bounded termination implies 
that the bound obtained is always polynomial (in terms of the initial values). Note that this is 
an inherent property — not an artifact of the analysis algorithm. 

Given this result, the natural next step is to ask how hard it is to obtain the precise degree 
of the bounding polynomial (which, for univariate polynomials, determines the bound up to 
a constant factor). Our theorem, proved in Section 5, shows that this too is decidable — and, 
naturally, we also determine its complexity class, which is still PSPACE-complete. 

While turning attention from bounded termination in general to precise degree bounds, we 
also change, in Section 5, the object of our study from a bound on the length of a computation 
to the number of visits to a prescribed point in the program. For this measure we adopt the term 
reachability bound (RB), introduced in [GZ10]. Defining the problem in terms of the reachability
bound has some advantages, as a procedure to determine the RB can be easily put to different
uses. If the property of interest is the total length of a computation, it is possible to bound it 
by computing the RB for selected cut points in the program (for a "big O" bound, it suffices to 
ensure that every cycle includes a cut point). The RB problem is more general, since different 
flow-points may have different bounds. If we are interested in consumption of some resource, 
consumed at specific points in the program, we can compute the RB for those points. Moreover, 
computation of flow-point RBs helps modularity in the following sense: suppose that certain 
flow-points $f_1, f_2, \dots$ are in fact procedure calls and we have computed costs of these procedures,
$c_i$. We then obtain an overall bound (though not always tight) by multiplying the RB of each
$f_i$ by its associated cost $c_i$ and summing the products.

To state our theorem, we define a decision problem, RBD (for reachability bound degree): for 
a given flow-point and degree k, is the RB for the flow-point a polynomial of degree at least k? 
This decision problem is proved PSPACE-complete. The algorithm proving the upper bound 
is based on the notion of a fully-elaborated CTS from [BA10b], where it was introduced for
constructing global ranking functions. Here, however, it is combined with closure computation 
(a well-known technique in size-change termination) and uses a new result (Theorem 5.10) that 
shows how the RB degree is determined by the closure set. A notable corollary of our analysis 
is that a constant-factor tight reachability bound is always a polynomial (not just polynomially 
bounded). Here it is assumed that the bound is expressed as a function of a single independent 
variable N (this may actually turn out to be the maximum of several actual inputs, a difference
of inputs, etc.), so that by a tight bound is meant a bound f(N) such that the actual worst-case
bound in terms of N is Ω(f(N)).
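
The closure computation mentioned above can be sketched in its classical form, for size-change graphs over a well-founded domain at a single flow point, as in [LJBA01]. This is not the algorithm of this paper for (MC, Z)-CTS, only a minimal Python illustration of the underlying technique. The encoding of a graph as a frozenset of (source, strict, target) triples and the function names are assumptions.

```python
from itertools import product

def compose(g, h):
    """Compose size-change graphs: x -> y in g and y -> z in h give x -> z.
    The composed arc is strict if either arc is; the strictest arc per pair is kept."""
    arcs = {}
    for (x, strict1, y), (y2, strict2, z) in product(g, h):
        if y == y2:
            arcs[(x, z)] = arcs.get((x, z), False) or strict1 or strict2
    return frozenset((x, s, z) for (x, z), s in arcs.items())

def closure(graphs):
    """Smallest set containing the given graphs and closed under composition."""
    closed = set(graphs)
    frontier = list(closed)
    while frontier:
        g = frontier.pop()
        for h in list(closed):
            for new in (compose(g, h), compose(h, g)):
                if new not in closed:
                    closed.add(new)
                    frontier.append(new)
    return closed

def sct_terminates(graphs):
    """Classical test: every idempotent graph in the closure has a strict arc x -> x."""
    return all(
        any(strict and x == z for (x, strict, z) in g)
        for g in closure(graphs)
        if compose(g, g) == g
    )

# Ackermann-style descent: either m decreases, or m does not increase and n decreases.
ackermann = [frozenset({("m", True, "m")}),
             frozenset({("m", False, "m"), ("n", True, "n")})]
assert sct_terminates(ackermann)
# A loop that only guarantees x >= x' may run forever.
assert not sct_terminates([frozenset({("x", False, "x")})])
```

The closure can be exponentially large in the number of variables, which is consistent with the PSPACE bounds discussed in the text; the sketch above makes no attempt to work in polynomial space.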

¹ Concurrently with our work, it was also put to use in complexity analysis by Zuleger et al. [ZGSV11].
Program 2:

    i=N;
    while (i>0) {
        if (j>0) j--;
        else {j=N; i--;}
    }

CFG and constraints (arcs labelled (1), (2), (3)):

    (1) i > 0 ∧ Same(N, 0, i, j)
    (2) j > 0 ∧ j > j' ∧ Same(N, 0, i)
    (3) j ≤ 0 ∧ j' = N' ∧ i > i' ∧ Same(N, 0)

Figure 2: CTS abstraction of a simple program. The notation Same(x, y, ...) is syntactic sugar
for indicating abstract variables that are constant in the transition (see Section 3).
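
The notion of a reachability bound can be illustrated on the program of Figure 2 by counting how often each of the three transitions is taken. The following Python sketch is an illustration only; the function name `visits` and the explicit initial value `j0` for the variable j (which the program does not initialize) are assumptions.

```python
def visits(n, j0):
    """Run the loop of Figure 2 and count how often transitions (1), (2), (3) are taken."""
    i, j = n, j0
    count = {1: 0, 2: 0, 3: 0}
    while i > 0:
        count[1] += 1          # (1): the loop guard i > 0 holds
        if j > 0:
            j -= 1
            count[2] += 1      # (2): j decreases, i unchanged
        else:
            j = n
            i -= 1
            count[3] += 1      # (3): j is reset to N, i decreases
    return count

for n in (1, 3, 10):
    c = visits(n, 0)
    assert c[3] == n                 # linear number of visits
    assert c[2] == (n - 1) * n       # quadratic number of visits
    assert c[1] == c[2] + c[3]
```

Different transitions of the same loop thus have reachability bounds of different degrees, which is the extra information an RB analysis offers over a single bound on the length of the computation.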



(MC, Z)-CTS as a back-end. Our paper can be viewed as a theoretical study of (MC, Z)-CTS.
However, we argue that such constraint transition systems are useful as an abstraction of "real"
programs. To support this claim, we have to discuss the manner in which a concrete program
is modeled by a (MC, Z)-CTS.

In termination analysis, the concrete-abstract connection is always based on the following 
principle: if the concrete program has an infinite execution, the abstract program will have
one. This is achieved in different ways depending on the nature of the concrete program (e.g.,
imperative versus pure-functional). Complexity analysis complicates this relationship: the above
principle clearly does not suffice. It is therefore necessary to discuss what conclusions on the
concrete program may be drawn from bounded termination, or a reachability bound, for the 
abstraction. 

Section 6 is dedicated to this discussion. Our choice is to keep this paper concentrated on 
the theory of (MC, Z)-CTS; therefore this discussion is quite informal. The support for our
arguments here is not theorems and proofs, but the practical experience of researchers who, 
previous to this work, have already used a CTS abstraction for complexity analysis. We discuss 
how this abstraction has been done in [ADFG10a] and [AAGP08]. The fact that they used a
richer constraint language has no consequence for this question. 

Briefly, the simplest case is of an imperative program, without procedure calls. The CFG of 
the (MC, Z)-CTS is essentially the flow-chart of the program, and the length of the computation
is related to time complexity. 

Next, we consider programs with recursive functions. We argue that for such programs, 
bounded termination most naturally yields a bound on stack height. Depending on the program's 
use of "heap space," we may be able to conclude that it runs in polynomial space, or just deduce 
an exponential time bound. 

The fact that our abstraction is coarser than the one used in the cited works is relevant to 
another concern: the loss of information due to abstraction. Section 6.4 discusses the impact of 
relaxing the (Aff, Z)-CTS abstraction (or possibly even stronger ones) to (MC, Z)-CTS. Such
relaxation, which may suffice for termination, does not always suffice for complexity analysis. 
An example can be seen in Figure 2: for termination, we could do with a simpler abstraction, 
eliminating all constraints involving the variable N. But then we would not obtain a bounded- 
terminating CTS. 
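
The effect of dropping the constraints on N can be simulated. In the sketch below, which is an illustration under stated assumptions and not part of the paper's development, the reset value of j in transition (3) is left to an adversary, as it would be if the abstraction no longer related j' to N. The parameter name `reset_value` is hypothetical. Every run terminates, yet runs from one and the same initial state can be made arbitrarily long.

```python
def run_length(i, j, reset_value):
    """Length of a run of the weakened abstraction of Figure 2, where transition (3)
    only requires j <= 0 and i > i', so the new value of j is unconstrained."""
    steps = 0
    while i > 0:
        if j > 0:
            j -= 1              # transition (2)
        else:
            j = reset_value     # transition (3): j' chosen freely by the adversary
            i -= 1
        steps += 1
    return steps

# Same initial state (i, j) = (2, 0), ever longer terminating runs.
lengths = [run_length(2, 0, k) for k in (10, 1000, 100000)]
assert lengths == sorted(lengths) and lengths[0] < lengths[-1]
```

No function of the initial values i and j alone can bound these run lengths, so the weakened system is terminating but not bounded-terminating.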

Our thesis is that, despite its relative simplicity, the monotonicity constraint abstraction 
stands a good chance of being effective in practice (when used judiciously). The ultimate test 
would, of course, be the construction of an industrial-strength tool; this is far beyond the scope
of our work, but existing related work (see the next section and Section 6.4) makes the prospects 
seem encouraging. 

As an additional informal argument for the interest in this abstraction, we include in Section 7
a few additional examples, collected from previous papers on complexity analysis, that illustrate 
different loop behaviours which are still all captured by our model. 

A comment is in order: practical cost-analysis tools typically generate explicit constants;
for example, they would generate a bound such as — n + 2 rather than just O(n^). However, 
the real bound may possibly be O(n^), since no tightness is guaranteed. In contrast, we chose 
to relax the expression of the bound to a big-Oh one but we show that the precise degree 
is decidable. Our algorithms can provide explicit constants, but they will be definitely over- 
approximative. Bounds that have precise explicit constants may be computable, too. We leave 
this as a challenge for further research. 

2 Related Work 

There is a surprisingly large body of work related to the topics of this paper. Most pertinent 
is the work in program analysis, directed at obtaining symbolic, possibly asymptotic, complexity 
bounds for programs (in a high-level language or an intermediate language) under generic cost 
models (either unit cost or a more flexible, parametrized cost model). In this section, to put our 
work in context, we cite some of these works and indicate what approaches were employed. The 
first subsection is an overview and cites various approaches. The second one elaborates on the 
works most directly related to ours. There are many other works in this area which have been 
left out; a complete survey would be an article in itself. 

2.1 Approaches in Complexity Analysis 

Seminal works. Wegbreit [Weg75] presented the first, and very influential, system for automatically
analysing a program's complexity. His system analyzes first-order LISP programs.
Broadly speaking, the system instruments a program to obtain a function that returns the desired
complexity measure, and then attempts to simplify the program until a closed form for the
function can be found. Possibly, the program becomes a set of recurrence equations for the complexity,
which have to be solved. Subsequent works along similar lines included [LM88, Ros89]
and more recently [Ben01, Ben04] for functional programs and [DwL93, DLGHL94] for logic programs.
The latter describe static analyses to deal with complications particular to the semantics
of logic programs, where programs compute sets of answers and involve backtracking. 

Studies of restricted languages. Our approach in this paper involves the study of complexity
properties of a simplified, abstract program. Research in Implicit Computational Complexity
(ICC) has produced numerous examples of programming languages that are so restricted that 
they capture an intended complexity class, that is, compute all, and only, functions of that class. 
Early examples include [Cob64, KA80, BC92]. Many of these restrictions (e.g., [Cob64, BC92]) 
may be seen (or are even explicitly presented) as imposing a certain type system on a language 
which, otherwise, could also compute outside the intended complexity class; but this is not an 
automated analysis in the sense that the programmer has to supply the "types" (in [Cob64], 
and also some later works like [CW00], these are explicit resource bounds). In these cases one
might describe the technique more as certification than analysis. However, ICC research has
also developed some methods that were later put to effective use in automated analysis. Two
notable examples are the method of term interpretations (see the paragraph on Term Rewriting
Systems below), and the method of linear types [Hof03], which yielded strong analysis techniques
as described, e.g., in [HH10, JHLH10].

SPEED is an ambitious project from Microsoft Research to create a complexity analysis tool 
using a variety of techniques, focusing on C programs [GG08, GMC09, GJK09, Gul09, GZ10].
In [GG08, GMC09], the essence of the technique is to instrument the program with a counter, so 
that the desired resource usage becomes an output value, and bound this value using invariant- 
generation methods. In [GJK09], the techniques are program transformation (called control-flow 
refinement) and "progress invariants," which are used for obtaining more precise bounds for 
nested loops. In [GZ10], the term reachability bound was coined, which we also employ in this
paper. 

Abstract interpretation techniques. While abstract interpretation [Cou96] is the de-facto
standard way of presenting many program analyses, in the realm of complexity analysis its role
has mostly been confined to supporting analyses (finding the ranges of values etc.). As mentioned
above, complexity analysis is sometimes reduced to computing a bound on computed values, and 
this is done by the traditional kind of abstract interpretation (invariant generation). However, 
there are a few works where abstract interpretations have been developed that directly result in 
complexity properties. In [MPS10] it was done for space complexity of a functional language. In
[NW06, JK09], simple imperative programming languages have been analysed for complexity; 
interestingly, because of the background in ICC rather than in static analysis, the terminology 
of abstract interpretation is not used. These works were followed by [BJK08, BA10a], where it was
shown that for languages of a similar style (imperative structure, very restricted in the usage of
data, and non-deterministic in control flow except for bounded loops), an abstract-interpretation-based
analysis is actually a decision procedure: for example, one can decide whether a program
is polynomial-time. In this paper, we are also interested in abstract programs whose properties 
of interest are decidable. However, the nature of the abstract programs is very different. 

Term Rewriting Systems are an elementary computational model that may be used to rep- 
resent programs from a variety of source languages. There is already much work on complexity 
analysis for TRSs. We mention two of the directions taken. [HM08b, HM08a, AM09, NEG11]
employ the dependency pair method, which, like the model we are studying, was originally conceived
for termination, and in fact has been effectively combined with size-change termination
[TG05, GTSKF06, CFGSK10].

Another method that has extended its scope from proving termination to proving complexity 
bounds in the context of Term Rewriting Systems is the polynomial interpretation method 
[BCMT01], later extended to other kinds of interpretation functions [MSW08, MP09, BD10,
NZM10, Wal10]. The method has some resemblance to the analysis of transition systems with
ranking functions, since the value of an interpretation has to decrease as computation progresses, 
but interpretations have a particular structure which is related to the structure of the terms in 
the system. Different interpretation methods have very different structures and it is beyond the 
scope of this work to survey this line of work in greater detail. It should be pointed out that,
basically, interpretations are proof methods and it is not always clear how to turn them into 
automatic analyses (in other words: how to synthesize suitable interpretations), but this issue 
is discussed in the literature, for example in [Ama05, MSW08] and many others. 

2.2 Analysis of Constraint Transition Systems

We have already described [ADFG10a], where (Aff, Z)-CTS was used as an abstract program
and analysed using lexicographic linear ranking functions. 

The COSTA project [AAGP10, AAG+12] targets symbolic analysis of Java bytecode programs.
It is a big project, in which involved methods of abstracting the concrete programs
were implemented, but this is unrelated to our topic. Our interest begins where they reach an 
abstract program representation, which they call CRS (for cost relation system). An example 
of a CRS (hberally modified from [AAGPIO]) is: 



where ki , k2 represents costs (and can be non-constant expressions depending on the variables) ; 
essentially, this can be understood as a non-deterministic sort of recursive program whose result 
is the desired cost bound. As a central part in the algorithm to bound this result, the system 
is simplified to eliminate indirect recursion (which is not possible for all systems, but is argued 
to work well in practice) and then the height of the recursion tree is bounded by looking at 
individual (multiple-path) loops, e.g., all the "calls" from E to E, and finding a linear ranking 
function for each such loop. In a structured program with nested loops, each loop will turn 
into this kind of a recursive cost relation and will therefore have to be bounded using a linear 
ranking function. This implies that a global ranking function of the lexicographic linear kind 
exists, but the technique is more restricted than [ADFGlOa] which finds a lexicographic linear 
ranking function by analysing the transition system globally (that is, the lexicographic structure 
does not have to follow the loop nesting). 

In comparison to our work, it is important to note that affine relations are expressive enough 
to make their termination problem undecidable (the simple argument is that counter machines 
can be represented). Thus, a complete solution cannot be achieved. One could try to relate our 
works by considering (A4C,Z)-CTS as a special case of {AjJ', Z)-CTS; if we do so, we find that 
their solutions do not encompass ours as a special case. Indeed, not every {M.C, Z)-CTS which is 
bounded terminating has a lexicographic linear ranking function (not even systems with a single 
program point). This is probably well known but will also be demonstrated by an example in 
Section 7. 

Monotonicity constraint transition systems. As mentioned earlier, monotonicity constraint 
transition systems were first used (with different terminology) for termination analysis 
of logic programs [Sag91, LS97, CT99]. In addition to this successful application, they 
have also been applied in the termination analysis of functional programs [LJBA01, MV06a, 
Kra07, SJ05] and imperative programs [SMP10, Ave06, CGBA+11]. Some works on the theory 
of (MC, Z)-CTS and their decision problems are [CLS05, MT09, BA09, BA10b, BA11]; decision 
procedures for extensions of the model have been discussed in [BA08, BP12]. 



E(a, j) = k1 + E(a', j') + F(a, j, j', a') 
F(a, j, j', a') = k2 + E(a, j + 1) 

{j' = j, a' = a - 1, a' > 0, j > 0} 
{j < a - 1, j > 0, a - a' = 1, j' = j} 

While this paper was in preparation, we learnt of the work of Zuleger et al. [ZGSV11], 
who also applied (MC, Z)-CTS (more specifically, size-change graphs) in the context of cost 
analysis. The cited conference paper does not provide all details; however, even a superficial look 
confirms that their use of the abstraction is essentially different from our work, since they do 
not employ an "abstract and conquer" approach where an abstract program becomes an object 
in itself. Instead, the abstraction is just one tool in a complex algorithm that processes source 
programs. 

3 Preliminaries 

The results in this paper build on previous research on the termination problem of (MC, Z)-CTS. 
To make the paper self-contained, we repeat in this section the basic definitions and certain 
results from previous work. Readers familiar with [BA11] will find little new here but should 
read Section 3.4. 

3.1 Monotonicity Constraint Systems and their semantics 

Definition 3.1. A (MC, Z)-CTS consists of a control-flow graph (CFG), monotonicity 
constraints and state invariants, all defined below. 

• A control-flow graph is a directed graph (allowing parallel arcs) over the set $F$ of flow 
points. Every flow-point $f$ is associated with a fixed list of variables^. The number of 
variables is called the arity of $f$ and may be denoted by $ar(f)$; the variables themselves 
are usually denoted in the text by $x_1, \dots, x_{ar(f)}$, though in examples we may use other 
identifiers, most naturally the names of variables of the source program. 

• A non-empty set of flow points, $F_{init} \subseteq F$, is designated as the initial flow points of the 
CFG. 

• Every CFG arc $f \to g$ is associated with a monotonicity constraint (MC), being a conjunction 
of order constraints $x \rhd y$ where $x, y \in \{x_1, \dots, x_{ar(f)}, x'_1, \dots, x'_{ar(g)}\}$, and $\rhd$ is either 
$>$ or $\ge$; for uniform notation, we also use $\rhd^1$ for $>$ and $\rhd^0$ for $\ge$. Note that $<$, $\le$, $=$ can 
be used as syntactic sugar. 

We write $G : f \to g$ to indicate the association of an MC $G$ with its source and target 
flow-points. 

A calligraphic-style letter (typically $\mathcal{A}$, for abstract program) is used to denote a (MC, Z)-CTS. 
$F^{\mathcal{A}}$ ($F^{\mathcal{A}}_{init}$) will be its flow-point (initial flow-point) set. A monotonicity constraint will often be 
denoted by $G$ because it is typically represented by a graph (as explained below). However, when 
graph-theoretic notions are applied to $\mathcal{A}$ (such as "$\mathcal{A}$ is strongly connected"), they concern the 
underlying CFG. In the text, a (MC, Z)-CTS may be succinctly referred to as "a system" when 
the meaning should be clear. 

^Called parameters or arguments in some publications, depending on the programming paradigm the authors 
have in mind. Similarly, flow points may be called program points or locations. 

State Invariants. Our representation of a (MC, Z)-CTS also includes, for each $f \in F$, an 
invariant $I_f$, which is a conjunction of order constraints among the variables. An example is 
$(x_1 > x_2) \wedge (x_3 = x_4)$. It is assumed that these constraints are also included in the MCs entering 
or leaving $f$ (note that for an MC entering $f$, the variables will be primed, as they belong to 
the target state). This assumption implies that the invariants are only a convenience, a way 
to indicate that some constraints will hold whenever $f$ is visited, irrespective of which of the 
incoming and outgoing transitions are taken. The reader will see later that our algorithms make 
significant use of this information. 

Semantics. Semantically, a (MC, Z)-CTS represents a transition relation over a set of (abstract) 
program states. In a state, every variable has a specific value. In this paper, all values 
are integers (as in [BA11] and unlike [BA10b, LJBA01, etc.], which dealt with well-founded 
sets). 

Definition 3.2 (states). A state of $\mathcal{A}$ is $s = (f, \sigma)$, where $f \in F^{\mathcal{A}}$ and $\sigma : \{1, \dots, n\} \to \mathbb{Z}$ 
represents an assignment of values to the variables, where $n = ar(f)$. The state is initial if 
$f \in F^{\mathcal{A}}_{init}$. 

Satisfaction of a predicate $e$ with free variables $x_1, \dots, x_n$ (for example, $x_1 > x_2$) by an 
assignment $\sigma$ is defined in the natural way, and expressed by $\sigma \models e$. If $e$ is a predicate involving 
the $n + n'$ variables $x_1, \dots, x_n, x'_1, \dots, x'_{n'}$, we write $\sigma, \sigma' \models e$ when $e$ is satisfied by setting the 
unprimed variables according to $\sigma$ and the primed ones according to $\sigma'$. 

Definition 3.3 (transitions). A transition is a pair of states, a source state $s$ and a target state 
$s'$. For $G : f \to g \in \mathcal{A}$, we write $(f, \sigma), (g, \sigma') \models G$ if $\sigma, \sigma' \models G$. 

Note that we may have unsatisfiable MCs, such as $x_1 > x_2 \wedge x_2 > x_1$; our algorithms will 
identify such MCs and ignore them. 

Definition 3.4 (transition system). The transition system associated with $\mathcal{A}$ is the binary 
relation 

$T_{\mathcal{A}} = \{(s, s') \mid s, s' \models G \text{ for some } G \in \mathcal{A}\}$. 

Note that some authors refer to a program representation as a "transition system." We use 
this term for a semantic object. Our view of a (MC, Z)-CTS is declarative: a set of constraints 
that describe the transition system $T_{\mathcal{A}}$. It is also possible to interpret a (MC, Z)-CTS 
operationally, as a kind of program. Every MC, $G : f \to g$, then represents a step that the program 
may take when in program location (label) $f$. The step consists of non-deterministically choosing 
values for the primed variables such that $G$ is satisfied by the current state plus the chosen new 
values. The new values are then assigned to the variables, and the program location changed 
to $g$. While we hope that this view may be useful to some readers, our formal development will 
use the declarative viewpoint. 

Definition 3.5 (run, height). A run of $T_{\mathcal{A}}$ is a (finite or infinite) sequence of states $\tilde{s} = 
s_0, s_1, s_2, \dots$ such that for all $i > 0$ (up to the end of the sequence), $(s_{i-1}, s_i) \in T_{\mathcal{A}}$. For a 
finite run $s_0, s_1, s_2, \dots, s_\ell$ we refer to $\ell$ as its length. The height of a state is the length of the 
longest run beginning at the state. 

Note that by the definition of $T_{\mathcal{A}}$, a run is associated with a sequence of CFG arcs labeled 
by $G_1, G_2, \dots$ where $s_{i-1}, s_i \models G_i$. This sequence constitutes a (possibly non-simple) path in 
the CFG. As a slight abuse of definition, we may associate the run with $\mathcal{A}$ rather than explicitly 
mentioning $T_{\mathcal{A}}$. 

Definition 3.6 (termination). A transition system is terminating if it has no infinite run from 
an initial state. A (MC, Z)-CTS $\mathcal{A}$ is terminating if $T_{\mathcal{A}}$ is terminating. 

This notion of termination was called rooted termination in [BA11], which also considered 
uniform termination, where reachability from an initial state is not taken into account. In 
the context of work on bounded termination, rooted termination is essential, and therefore the 
unqualified term will refer, in this paper, to rooted termination. 

Definition 3.7 (bounded termination). A transition system satisfies bounded termination if it 
is terminating and the height of every initial state is finite. We say that a (MC, Z)-CTS $\mathcal{A}$ 
satisfies bounded termination if $T_{\mathcal{A}}$ does (we also say that $\mathcal{A}$ is bounded-terminating). 

Ben-Amram [BA11] proved that (MC, Z)-CTS termination is decidable, and, more precisely, 
PSPACE-complete. We shall prove the same for bounded termination. It is important to note 
that a terminating program is not necessarily bounded-terminating, as in the next example. 

Example 3.1. A classic example of termination analysis is the Ackermann function, here in 
pure-functional style: 

ack(m,n) = if m<=0 then n+1 else 
           if n<=0 then ack(m-1,1) 
                   else ack(m-1,ack(m,n-1)) 

The straightforward abstraction to a (MC, Z)-CTS has a single-node control-flow graph (the 
node represents the function ack), with three self-loops representing the recursive calls (here in 
the order of the call sites in the program text): 

(1)  $m > 0 \wedge n \le 0 \wedge m > m' \wedge n' > 0' \wedge 0 = 0'$ 
(2)  $m > 0 \wedge n > 0 \wedge m > m' \wedge 0 = 0'$ 
(3)  $m > 0 \wedge n > 0 \wedge m = m' \wedge n > n' \wedge 0 = 0'$ 

Note the constraints $0 = 0'$; these are included since in our constraint language there is no 
notion of constant. Technically, 0 is a state variable, hence the need for explicitly stating that it 
is constant^. The need for constraints like that also arises because of the "frame problem" (as it 
is called in Artificial Intelligence), that is, the need to state explicitly that variables not affected 
by a transition do not lose their value. In order to make the writing of these constraints more 
concise, we use the notation Same(x, y, ...) for $x = x' \wedge y = y' \wedge \dots$ (as in Figure 2). 

Returning to the Ackermann example, it is easy to verify that this constraint transition 
system terminates; in fact, it has the global ranking function (m, n). However, it is not 
bounded-terminating. Indeed, for any (arbitrarily large) number $N$, it has a transition sequence of length 
$N + 1$ from the initial state (2, 1): 

$(2, 1) \to (1, N) \to (1, N-1) \to \dots \to (1, 0)$ 


^Constants can also be explicitly added to the constraint language, see [BP12]. 

[Figure 3 graphs not reproduced: $G_1 : w \to i$, $G_2 : i \to w$, $G_3 : i \to w$.] 

Figure 3: MCs as graphs. The left-hand side is the source. Broken arcs are non-strict, solid arcs 
are strict. 

The concrete program is, of course, bounded-terminating, because it is deterministic. Thus the 
length of the run is a function of the initial state. This information is lost because the abstraction 
is non-deterministic, and only over-approximates the semantics of the concrete program. To be 
more precise, it is the fact that we have unbounded non-determinism that causes the problem; if 
the abstraction had been non-deterministic, but finitely branching, by König's lemma it would 
still be bounded-terminating. 
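
The unbounded run of the abstract Ackermann system can be checked mechanically. The sketch below is only illustrative: it encodes the three MCs of Example 3.1 as we read them from the text, compares against the literal 0 instead of carrying the constant as a state variable, and uses the hypothetical helper names `step_ok` and `abstract_run`.

```python
def step_ok(state, nxt):
    """True if (state, nxt) satisfies one of the three MCs of Example 3.1."""
    m, n = state
    m2, n2 = nxt
    mc1 = m > 0 and n <= 0 and m > m2 and n2 > 0   # call ack(m-1, 1)
    mc2 = m > 0 and n > 0 and m > m2               # outer call ack(m-1, ...)
    mc3 = m > 0 and n > 0 and m == m2 and n > n2   # inner call ack(m, n-1)
    return mc1 or mc2 or mc3


def abstract_run(big_n):
    """The run (2,1) -> (1,N) -> (1,N-1) -> ... -> (1,0), with N = big_n."""
    return [(2, 1)] + [(1, k) for k in range(big_n, -1, -1)]


for big_n in (1, 10, 1000):
    run = abstract_run(big_n)
    # Every consecutive pair of states is a legal abstract transition.
    assert all(step_ok(s, t) for s, t in zip(run, run[1:]))
    print(big_n, len(run) - 1)   # number of transitions is N + 1
```

The first step uses MC (2), which leaves the new value of n unconstrained; this is exactly the unbounded non-determinism discussed above.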

3.2 MC graphs and multipaths 

It is convenient for reasoning, and practical for algorithms, to represent MCs as directed graphs. 
These graphs have nodes $x_1, \dots, x_n, x'_1, \dots, x'_{n'}$, for the appropriate arities $n, n'$, and represent 
each relation $x \rhd y$ by an arc from $x$ to $y$; an arc representing a strict inequality is called a strict arc. A 
path in the graph is called strict if it includes at least one strict arc. 

Standard graph algorithms can be used to perform operations such as path-finding and 
ensuring that the representation is closed under logical consequence, which is a simple reachability 
closure in the graph. In the process, we also identify (and remove) unsatisfiable MCs. Clearly, 
an MC is unsatisfiable if and only if there is a strict cycle. 
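
The closure and the strict-cycle test can be sketched in a few lines of Python. This is a minimal sketch under our own encoding (arcs as `(x, strict, y)` triples over arbitrary node names); the function names are hypothetical and the cubic closure is chosen for brevity, not efficiency.

```python
from itertools import product


def closure(nodes, arcs):
    """Reachability closure of an MC graph.

    arcs is a list of (x, strict, y): an arc from x to y, read as x > y
    when strict is True and x >= y otherwise. The result maps (x, y) to
    True if a strict simple path from x to y exists, to False if only
    non-strict paths exist; unrelated pairs are absent.
    """
    rel = {}
    for x, strict, y in arcs:
        rel[(x, y)] = rel.get((x, y), False) or strict
    # Floyd-Warshall style closure: the intermediate node k varies slowest.
    for k, i, j in product(nodes, repeat=3):
        if (i, k) in rel and (k, j) in rel:
            strict = rel[(i, k)] or rel[(k, j)]
            rel[(i, j)] = rel.get((i, j), False) or strict
    return rel


def is_satisfiable(nodes, arcs):
    """An MC is satisfiable iff its graph has no strict cycle."""
    rel = closure(nodes, arcs)
    return not any(rel.get((x, x), False) for x in nodes)


print(is_satisfiable(["x1", "x2"],
                     [("x1", True, "x2"), ("x2", True, "x1")]))  # False
print(closure(["x", "y", "z"],
              [("x", True, "y"), ("y", False, "z")])[("x", "z")])  # True
```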

Example 3.2. Figure 3 shows MCs extracted from the program below. The flow-points are w 
(entry to the while command) and i (entry to the if statement). 

while (m<n) 

if (m>0) n := n-1 
else m := m+1 

Some publications use the term MC graph (MCG); we, however, identify an MC with its 
graph representation. This should not cause any problems. We also use set notation, such as 
$(x > y) \in G$. We define the notation $(x, y) \in G$ to mean that $x$ and $y$ are related in $G$ (without 
indicating the relation, which may be $>$, $\ge$, $<$ or $\le$). We employ the same notations with respect 
to state invariants, e.g., $(x > y) \in I_f$. Note that the above example does not have any state 
invariants (in the given abstraction), because the transitions that enter and exit each point do 
not agree on any relation among the state variables. 

Notation. Whenever graphs are considered, the notation $u \leadsto v$ means that there is a path from 
$u$ to $v$. The notation $p : u \leadsto v$ names the path. 

[Figure 4 graph not reproduced: nodes $x[t, i]$ for $0 \le t \le 3$ and $1 \le i \le 3$.] 

Figure 4: A multipath. 

Definition 3.8 (multipath). Let $\mathcal{A}$ be a (MC, Z)-CTS. Let $f_0, f_1, f_2, \dots \in F^{\mathcal{A}}$ be a (finite or 
infinite) list of flow-points connected by MCs $G_t : f_{t-1} \to f_t$ (clearly, this constitutes a path in 
the CFG). The multipath $M$ that corresponds to this path is a (finite or infinite) graph with 
nodes $x[t, i]$, where $t$ ranges from 0 up to the length of the path (which we also refer to as the 
length of $M$), and $1 \le i \le ar(f_t)$. Its arcs are obtained by merging the following sets: for 
all $t \ge 1$, $M$ includes the arcs of $G_t$, with source variable $x_i$ renamed to $x[t-1, i]$ and target 
variable $x'_j$ renamed to $x[t, j]$. 

The multipath may be written concisely as $G_1 G_2 \dots$; for example, Figure 4 illustrates a 
multipath $G_2 G_1 G_2$, based on the MCs from Figure 3. The term multipath (originating in [LJBA01]) 
hints at the multiple paths that may exist in the graph representation of $M$ (the importance of 
these paths is further discussed below). We use the expression $\mathcal{A}$-multipath when it is necessary 
to name the CTS that $M$ is formed from. 

If $M_1, M_2$ are finite multipaths, and $M_1$ corresponds to a CFG path that ends where $M_2$ 
begins, we denote by $M_1 M_2$ the result of concatenating them in the obvious way. The notation 
$M : f \to g$ indicates the initial and final flow-points of $M$. 

Clearly, a multipath can be interpreted as a conjunction of constraints on a set of variables 
associated with its nodes. We consider assignments $\sigma$ to these variables, where the value assigned 
to $x[t, i]$ is denoted $\sigma[t, i]$. 

A multipath may be seen as an execution trace of the abstract program, whereas a satisfying 
assignment constitutes a (concrete) run of $T_{\mathcal{A}}$. Conversely, every run of $T_{\mathcal{A}}$ constitutes a satisfying 
assignment to the corresponding multipath. Multipaths that start at an initial flow-point are 
called rooted. Termination can thus be expressed as non-existence of satisfiable, rooted infinite 
multipaths. 

As for single MC graphs, we have 

OBSERVATION 3.9. A finite multipath is satisfiable if and only if it does not contain a strict 
cycle. 
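
To make Definition 3.8 and Observation 3.9 concrete, the following sketch builds a finite multipath from a list of MCs and tests it for a strict cycle. Several things here are assumptions of ours rather than the paper's: the encoding of variables as `(index, primed)` pairs, the function names, the use of a Bellman-Ford pass (strict arcs weighted -1) to detect strict cycles, and the MCs `G1`, `G2`, `G3`, which are our own reading of the program in Example 3.2 since Figure 3 is not reproduced here.

```python
def multipath(mcs):
    """Arcs of the multipath G_1 G_2 ... G_l; node x[t, i] is the pair (t, i)."""
    arcs = []
    for t, mc in enumerate(mcs, start=1):
        def node(var):
            index, primed = var
            return (t, index) if primed else (t - 1, index)
        arcs.extend((node(x), strict, node(y)) for x, strict, y in mc)
    return arcs


def has_strict_cycle(arcs):
    """Bellman-Ford from a virtual source; a strict cycle has negative weight."""
    nodes = {v for x, _, y in arcs for v in (x, y)}
    dist = dict.fromkeys(nodes, 0)
    for _ in range(len(nodes) + 1):
        changed = False
        for x, strict, y in arcs:
            weight = -1 if strict else 0
            if dist[x] + weight < dist[y]:
                dist[y] = dist[x] + weight
                changed = True
        if not changed:
            return False      # distances stabilized: no negative cycle
    return True               # still relaxing after |V| + 1 rounds


def same(i):
    """x_i = x_i', written as two non-strict constraints."""
    return [((i, False), False, (i, True)), ((i, True), False, (i, False))]


Z, M, N = 1, 2, 3   # variable indices for the constant 0, m and n
G1 = [((N, False), True, (M, False))] + same(Z) + same(M) + same(N)  # m < n
G2 = ([((M, False), True, (Z, False)), ((N, False), True, (N, True))]
      + same(Z) + same(M))                                 # m > 0, n > n'
G3 = ([((Z, False), False, (M, False)), ((M, True), True, (M, False))]
      + same(Z) + same(N))                                 # m <= 0, m < m'

print(has_strict_cycle(multipath([G2, G1, G2])))   # False: satisfiable
print(has_strict_cycle(multipath([G2, G1, G3])))   # True: unsatisfiable
```

The second multipath is unsatisfiable because m > 0 is preserved by the first two steps, which contradicts the guard m <= 0 of the third.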

We next consider down-paths and up-paths. The definition of a down-path is just the 
standard definition of a graph path, but it is renamed in order to accommodate the notion of 
an up-path. 

Definition 3.10. A down-path in a graph is a sequence $(v_0, e_1, v_1, e_2, v_2, \dots)$ where for all $i$, $e_i$ 
is an arc from $v_{i-1}$ to $v_i$ (in the absence of parallel arcs, it suffices to list the nodes). An up-path 
is a sequence $(v_0, e_1, v_1, e_2, v_2, \dots)$ where for all $i$, $e_i$ is an arc from $v_i$ to $v_{i-1}$. 

The term path may be used generically to mean either a down-path or an up-path (such 
usage should be clarified by context). 

Semantically, in an MC or a multipath, a down-path represents a descending chain of values, 
whereas an up-path represents an ascending chain. Note also that an up-path listed backwards 
is a down-path in the transposed graph. 

Definition 3.11. Let $M = G_1 G_2 \dots$ be a multipath. A down-thread in $M$ is a down-path that 
only includes arcs of the form $(x[t, i] \to x[t+1, j])$. 

An up-thread in $M$ is an up-path that only includes arcs of the form $(x[t, i] \leftarrow x[t+1, j])$. 

A thread is either. 

Definition 3.12 (cyclic). We say that a transition, or a multipath, is cyclic if its source and 
target flow-points are equal. 

The next lemma and the following definitions, all from [BA11], are only used in Section 5. 

LEMMA 3.13. If a strongly connected (MC, Z)-CTS satisfies SCT, every finite multipath 
includes a strict, complete thread. 

Definition 3.14 (composition). The composition of MC $G_1 : f \to g$ with $G_2 : g \to h$, written 
$G_1; G_2$, is a MC with source $f$ and target $h$, which includes all the constraints among $s, s'$ implied 
by $\exists s'' : s, s'' \models G_1 \wedge s'', s' \models G_2$. 

Definition 3.15 (collapse). For a finite multipath $M = G_1 \dots G_\ell$, let $\overline{M} = G_1; \cdots; G_\ell$. This 
is called the collapse of $M$. 
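
Composition can be computed by placing the two MCs on consecutive layers of a two-step multipath, closing the graph under reachability, and keeping only the constraints between the first and last layers. The sketch below assumes both MCs are satisfiable and uses our own `(index, primed)` encoding of variables; the function name `compose` is hypothetical.

```python
from itertools import product


def compose(g1, g2):
    """Composition G1;G2 of two satisfiable MCs.

    Constraints are (x, strict, y) with variables written (index, primed).
    The result relates source variables of g1 to target variables of g2.
    """
    # Place g1 on layers 0-1 and g2 on layers 1-2 of a two-step multipath.
    rel = {}
    for layer, mc in enumerate((g1, g2)):
        for (i, pi), strict, (j, pj) in mc:
            key = ((layer + pi, i), (layer + pj, j))
            rel[key] = rel.get(key, False) or strict
    nodes = sorted({v for key in rel for v in key})
    # Reachability closure, remembering whether a strict path exists.
    for k, a, b in product(nodes, repeat=3):
        if (a, k) in rel and (k, b) in rel:
            strict = rel[(a, k)] or rel[(k, b)]
            rel[(a, b)] = rel.get((a, b), False) or strict
    # Drop the middle layer; layer 2 plays the role of the primed variables.
    result = set()
    for ((ta, i), (tb, j)), strict in rel.items():
        if ta != 1 and tb != 1 and (ta, i) != (tb, j):
            result.add(((i, ta == 2), strict, (j, tb == 2)))
    return result


g1 = [((1, False), True, (1, True))]     # x1 > x1'
g2 = [((1, False), False, (1, True))]    # x1 >= x1'
print(compose(g1, g2))   # {((1, False), True, (1, True))}, i.e. x1 > x1'
```

The collapse of a longer multipath can then be obtained by folding `compose` over its MCs from left to right.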

Definition 3.16 (reachability). A flow-point $f \in F^{\mathcal{A}}$ is reachable if there is a satisfiable finite 
multipath $M : f_0 \to f$ such that $f_0$ is initial. 

Definition 3.17. Given a (MC, Z)-CTS $\mathcal{A}$, its closure set $cl(\mathcal{A})$ is 

$\{\overline{M} \mid M \text{ is a satisfiable finite } \mathcal{A}\text{-multipath starting at a reachable flow-point}\}$. 

3.3 Stability 

Definition 3.18 (stability). A (MC, Z)-CTS $\mathcal{A}$ is stable if (1) all MCs in $\mathcal{A}$ are satisfiable; (2) 
in the CFG of $\mathcal{A}$, all flow-points are reachable from an initial flow-point; (3) to every $f \in F^{\mathcal{A}}$ is 
associated an invariant $I_f$ such that for all $G : f \to g$ in $\mathcal{A}$, $(x_i \rhd x_j) \in G \iff (x_i \rhd x_j) \in I_f$; 
similarly, $(x'_i \rhd x'_j) \in G \iff (x_i \rhd x_j) \in I_g$. 

LEMMA 3.19. [BA10b] Suppose that (MC, Z)-CTS $\mathcal{A}$ is stable. Then every finite multipath 
is satisfiable. 

Note that in stable systems the flow-point invariants play an essential role since they are 
supposed to contain all the information that can be deduced from the adjacent MCs. The 
process of stabilizing a (MC, Z)-CTS involves splitting flow-points in the CFG whose original 
invariants were not precise enough. Algorithms for stabilization are described in [BA10b]. Such 
an algorithm transforms a (MC, Z)-CTS $\mathcal{A}$ into an equivalent stable system, which we denote 
by $S(\mathcal{A})$ ("equivalent" means that they have the same runs, up to renaming of flow-points or 
possibly variables). We say that $S(\mathcal{A})$ is a refinement of $\mathcal{A}$, since it explicitly separates states 
that in $\mathcal{A}$ are not explicitly separated. 

[Figure 5 diagram not reproduced: copies of the flow-points w and i, each annotated with its 
invariant over 0, m and n, connected by arcs labeled with the original MCs.] 

Figure 5: A stabilized CFG. 



Figure 5 shows how the CFG of Example 3.2 (which originally had two nodes) is transformed 
by stabilization. The w node has been split in four and the i node in two. There are also several 
CFG arcs that represent the same original transition, for example $G_1$ appears twice. The MCs 
annotating these arcs will not be identical to $G_1$, since the source and target invariants are 
merged into each MC. Note also that there are now several initial flow-points (namely all the 
nodes labeled w). 

In the worst case, such a transformation can multiply the size of the system by a factor 
exponential in the number of variables $n$ (bounded by the Ordered Bell Number $B_n$ which is 
between $n!$ and $2^{n-1} n!$ [Slo, Seq. A670]). 
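
For a feel of this growth, the ordered Bell numbers (the number of weak orderings of n values, OEIS A000670) can be computed from a standard recurrence. The snippet below is an illustrative sketch; it also checks the bounds $n! \le B_n \le 2^{n-1} n!$, where the upper bound is our reading of the garbled expression in the text. The function name `ordered_bell` is ours.

```python
from math import comb, factorial


def ordered_bell(n):
    """Number of weak orderings of n items, via
    a(m) = sum_{k=1..m} C(m, k) * a(m - k), with a(0) = 1."""
    a = [1]
    for m in range(1, n + 1):
        a.append(sum(comb(m, k) * a[m - k] for k in range(1, m + 1)))
    return a[n]


for n in range(1, 7):
    b = ordered_bell(n)
    assert factorial(n) <= b <= 2 ** (n - 1) * factorial(n)
    print(n, b)   # 1, 3, 13, 75, 541, 4683
```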

3.4 Full elaboration 

Full elaboration [BA10b] may be seen as a brute-force way of obtaining a stable version of a 
program. The key observation is that for a finite number of variables, there are only finitely 
many orderings of their values. It is thus possible to exhaustively list all possibilities and create 
an explicit representation of how transitions will affect each one. This transformation is useful 
for proving that certain problems are decidable in polynomial space, since one does not actually 
need to list all possibilities; they can be created "on the fly." This should become clear from the 
explanations below. 

In this paper we make subtle adjustments to the presentation of full elaboration in [BA10b], 
which are important for its use in Section 5. 

Definition 3.20 (full elaboration). A (MC, Z)-CTS $\mathcal{A}$ is fully elaborated if the following 
conditions hold: 

(1) Each state invariant fully specifies the relations among all variables, so that they are in 
ascending order by index. Namely, for all $i < j \le ar(f)$, $I_f$ includes $x_i < x_j$. 

(2) Each MC is satisfiable. 

(3) In the CFG of $\mathcal{A}$, all flow-points are reachable from an initial flow-point. 

Indexing the variables in sorted order has some convenient consequences. 

LEMMA 3.21. In a fully-elaborated system, every MC, $G$, has the downward closure property: 
for all $k < j$, $(x_i \rhd^d x'_j) \in G$ entails $(x_i \rhd^b x'_k) \in G$, where $b \ge d$ (that is, the latter relation is at 
least as strict). 

Since equalities are not allowed in the state invariants, creating a fully-elaborated version of 
a given (MC, Z)-CTS may require the coalescing of variables related by an equality constraint. 
Otherwise, it is a process similar to stabilization, as described previously. 

Definition 3.22. For a (MC, Z)-CTS $\mathcal{A}$, its full elaboration $E(\mathcal{A})$ is a fully-elaborated system 
that simulates $\mathcal{A}$ in the following sense (see also example below): 

• A flow-point of $E(\mathcal{A})$ is specified by a pair $(f, \psi)$ where $f$ is an $\mathcal{A}$ flow-point and $\psi : 
[1, ar(f)] \to [1, k]$ maps variables of $f$ to variables of $(f, \psi)$, whose arity $k$ lies between 1 
and $ar(f)$. This mapping $\psi$ is required to be surjective, but is not necessarily injective; 
thus, two $\mathcal{A}$ variables may be coalesced in a corresponding flow-point of $E(\mathcal{A})$. The variable 
mapping, and the invariant of $(f, \psi)$, are consistent with the invariants of $f$ (in particular, 
variables are coalesced if and only if they are constrained by an equality). 

• Given two such points $(f, \psi)$ and $(g, \psi')$, and an MC $G : f \to g$ from $\mathcal{A}$, there is a 
corresponding MC $G' : (f, \psi) \to (g, \psi')$ which is obtained by renaming the variables in 
$G$ according to $\psi, \psi'$ and adding the elaborated flow-point invariants, provided that the 
result is satisfiable. 

• An initial point of $E(\mathcal{A})$ is any point $(f, \psi)$ such that $f$ is initial in $\mathcal{A}$. 

• $E(\mathcal{A})$ contains all possible combinations $(f, \psi)$ as above, that are reachable from an initial 
point. 

Example 3.3. Consider the system of Example 3.2. One of the initial flow-points of the elaborated 
system will be $(w, [0 \mapsto 1, m \mapsto 2, n \mapsto 3])$ (we are using identifiers for the variables at w, for 
legibility). This flow-point represents all initial states in which $0 < m < n$. A transition, 
corresponding to $G_1$, takes this flow-point to a flow-point corresponding to i and having the 
same variable mapping, since the order among the values is unchanged. 

Consider now the elaborated flow-point $(w, [0 \mapsto 1, m \mapsto 1, n \mapsto 2])$. It represents states 
in which $0 = m < n$. A transition, corresponding to $G_1$, takes this flow-point to a flow-point 
corresponding to i and having the same variable mapping. From this flow-point, there will 
be an out-going transition representing $G_3$, but none representing $G_2$, since that would be 
unsatisfiable. (End of example) 

Note that given two elaborated flow-points $(f, \psi)$ and $(g, \psi')$, and an MC $G : f \to g$ from $\mathcal{A}$, 
a simple polynomial-time procedure computes the elaborated transition $G_E : (f, \psi) \to (g, \psi')$, 
or possibly rejects it as unsatisfiable, assuming (and not verifying) that $(f, \psi)$ is itself reachable 
in $E(\mathcal{A})$. This is why it is not necessary, in the algorithms where we shall use full elaboration, 
to pre-compute a full representation of $E(\mathcal{A})$ and keep it in memory. 

When presenting an algorithm that processes a fully-elaborated transition system, we will 
not use the notation $(f, \psi)$ for flow-points since we do not care about the correspondence with 
the original system. We will use simple identifiers instead. 

4 The Bounded Termination Problem 

This section gives our first theoretical result: decidability and complexity of the bounded ter- 
mination problem, and the corollary that height bounds are polynomial. 

4.1 Discovering Bounded Variables 

To establish bounds on transition-sequence length we need bounds on the values of variables 
throughout the execution, in terms of the initial values. So, we are looking for invariants of the 
kind $x_i \le x_j^0$, where $x_j^0$ denotes the initial value of $x_j$. Such an inequality relates values at two different 
points in the execution; it is not a property of a single state, which could be captured by a state invariant. This 
apparent difficulty is easily solved by instrumenting the program. Specifically, we make a copy 
of the initial variables. The copies are never modified but carried over to every subsequent state 
and turn the relationship of current values to initial values into a property of states. In this 
paper we will, for simplicity, create only two such variables: $x_{max}$ to represent the maximum 
among initial values, and $x_{min}$ to represent the minimum. This will allow us to determine 
whether a subsequently-computed value is upper-bounded by at least one initial value (which 
is the same as being bounded by $x_{max}$) or lower-bounded by at least one initial value (same as 
lower-bounded by $x_{min}$). Note that this instrumentation is part of the algorithm whose input 
is the constraint transition system; we do not deal with concrete programs. We find it more 
legible to avoid using numeric indices for these variables, though technically they will just be 
$x_{n+1}$ and $x_{n+2}$ where $n$ is the original arity. 

Definition 4.1. For a given (MC, Z)-CTS $\mathcal{A}$, the instrumented version $I(\mathcal{A})$ is obtained by the 
following steps. 

(1) Add two new variables $x_{max}$, $x_{min}$ to every flow-point. 

(2) Add a new initial point $f_0$ with an invariant $I_{f_0}$ that expresses the intended relationship of 
$x_{max}$ and $x_{min}$ to the initial value of $x_j$ for $1 \le j \le ar(f_0)$, namely $x_{max} \ge x_j$ and $x_{min} \le x_j$. 

(3) Add a transition from $f_0$ to each of the original initial points, with constraints $x_i = x'_i$ 
for all $i$, in addition to constraints inherited from $I_{f_0}$. 

(4) Add constraints $x_{max} = x'_{max}$ and $x_{min} = x'_{min}$ to all transitions. 

The reader may note that the constraints cannot express that $x_{max}$ is precisely the maximum 
among initial values, but the effect is the same. If for some flow-point we can deduce the 
invariant $x_{max} \ge x_j$, then $x_j$ must be bounded by one of the initial values, since $x_j$ is related 
to $x_{max}$ only by paths passing through $x_1, \dots, x_{ar(f_0)}$ at $f_0$. More formally: 

LEMMA 4.2. Let $M$ be a rooted multipath of $I(\mathcal{A})$. Suppose that $M$ is satisfiable. Then 
there is a satisfying assignment $\sigma$ for $M$ such that $\sigma[0, max]$ (the assignment of $x_{max}$ at the initial 
flow-point) is exactly $\max_{1 \le i \le n} \sigma[0, i]$; and $\sigma[0, min]$ is exactly $\min_{1 \le i \le n} \sigma[0, i]$. 
Such an assignment will be called tight. 

The next step will be to compute the stable program $S(I(\mathcal{A}))$. Then we proceed to identifying 
bounded variables. To see why stabilization is necessary, consider the program in Figure 6, shown 
together with its control-flow graph. The notation x:=* represents the assignment of a value 
unrelated to the program inputs. 

It is easy to see that at point W, there is no invariant that bounds one of the variables (or 
both) in terms of the input values. The closest we might come is to establish a disjunctive 
invariant of the form "either the value of x or the value of y is bounded by the input b," but 
such an invariant is not useful for our approach, as will be seen below. Stabilization solves this 
problem: in a stable system, each of the possible cases ($x \le b$ and $y \le b$) is represented by a 
distinct flow-point. 

input a , b 

if (*) then x := b; y 
else y := b; 
while ( x>a and y>a ) 
x := x-1; y : 

Figure 6: A program and its CFG (with flow-points for the If, the Then and Else branches, and 
the While header). Which variable is bounded at W? 

Definition 4.3. For every flow-point $f$ of $S(I(\mathcal{A}))$, let 

$B(f) = \{ j \mid (x_j \rhd^b x_{min}), (x_{max} \rhd^d x_j) \in I_f \text{ for some } b, d \}$. 

We call $B(f)$ the set of bounded variables at $f$. 
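
Reading off the bounded variables from a flow-point invariant is a simple membership test once the invariant is closed under logical consequence, as it is in a stable system. The following sketch assumes our own encoding (constraints as `(x, strict, y)` triples over variable names, with the hypothetical names `"xmax"` and `"xmin"` for the instrumentation variables).

```python
def bounded_variables(invariant, variables):
    """B(f): the variables bounded below by xmin and above by xmax.

    invariant is a set of (x, strict, y) triples, read as x > y or x >= y,
    and is assumed to be closed under logical consequence.
    """
    above_min = {x for x, _, y in invariant if y == "xmin"}
    below_max = {y for x, _, y in invariant if x == "xmax"}
    return {v for v in variables if v in above_min and v in below_max}


# x lies between xmin and xmax; y is only known to be below xmax.
inv = {("xmax", False, "x"), ("x", True, "xmin"), ("xmax", False, "y"),
       ("xmax", True, "xmin")}
print(bounded_variables(inv, ["x", "y"]))   # {'x'}
```

Here y is bounded from above only, so it is not in the set.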

4.2 Deciding Bounded Termination 

We next provide a decision algorithm for bounded termination. The idea, in a nutshell: ignore 
all non-bounded variables, and check for termination. The example given earlier illustrates that 
the role of stability in the definition of $B$ is crucial for the correctness of this algorithm. 

If $M$ is a multipath, a B-restricted assignment for $M$ is an assignment $\sigma$ to the variables of $M$ 
that assigns integers to the bounded variables and the special value $\bot$ to all others. This value 
satisfies any constraint (including $\bot > \bot$), which means that the non-bounded variables do not 
influence the satisfaction of constraints. If such an assignment exists, $M$ is called B-satisfiable. 

Definition 4.4. We say that $S(I(\mathcal{A}))$ B-terminates if there is no infinite, rooted multipath $M$ 
which is B-satisfiable. 

Algorithm 4.1. (Bounded Termination) Input: (MC, Z)-CTS $\mathcal{A}$. 

1. Build $S(I(\mathcal{A}))$. 

2. Perform a decision procedure for termination on $S(I(\mathcal{A}))$, taking only bounded variables 
into account. 

3. Return the result of the termination procedure. 

Clearly, the algorithm checks for B-termination of $S(I(\mathcal{A}))$. We next prove that this is a sufficient 
and necessary condition for bounded termination. 

THEOREM 4.5 (extended soundness). If $S(I(\mathcal{A}))$ B-terminates, then $\mathcal{A}$ bounded-terminates. 
Moreover, let $\max x^0$ and $\min x^0$ be the maximum and minimum values among the variables of the 
initial state. The height of the initial state is $O((\max x^0 - \min x^0)^n)$, where the constant factor 
depends on the size of $S(I(\mathcal{A}))$, and $n$ is the maximum of the flow-point arities in $\mathcal{A}$. 

Proof. The fact that $\mathcal{A}$ is terminating is immediate: B-satisfiability is a weaker requirement than 
satisfiability, so B-termination implies termination. To justify bounded termination, consider any 
rooted $S(I(\mathcal{A}))$-multipath, and a satisfying, tight B-restricted assignment. All bounded variables 
will be assigned values between $\min x^0$ and $\max x^0$. The variables $x_{max}$ and $x_{min}$ are constant 
throughout. Thus there are at most $m(\max x^0 - \min x^0)^n$ different states that can potentially appear 
in a satisfying assignment to this multipath, where $m$ is the number of flow-points in $S(I(\mathcal{A}))$. 
There can be no repeated states, since otherwise one can use an obvious "cut and paste" argument 
and exhibit an infinite B-satisfiable multipath. We conclude that the height of the initial state is 
bounded by $m(\max x^0 - \min x^0)^n$. □ 

To prove completeness, we use the following fact [BA11]. 

LEMMA 4.6. If $\mathcal{A}$ is a non-terminating (MC, Z)-CTS with initial point $f_0$, there is a cyclic 
multipath $L : f \to f$, and a rooted multipath $H : f_0 \to f$, such that $HL^\omega$ ($H$ followed by an 
infinite sequence of $L$'s) is satisfiable.^ 

And we add a new lemma. 

LEMMA 4.7. Let $M$ be a finite multipath in $S(I(\mathcal{A}))$ and $\sigma$ a B-restricted assignment for $M$. 
It is possible to extend $\sigma$ to an assignment $\sigma'$ that satisfies $M$. 

Extending $\sigma$ simply means assigning integer values to the variables left undefined ($\bot$) by $\sigma$, 
which are the non-bounded variables appearing in $M$. 

Proof. We treat $M$ as a directed graph, with arcs weighted by 0 for a non-strict arc and $-1$ for a 
strict one. Since $M$ is satisfiable (Lemma 3.19), there is no negative-weight cycle; hence, for all 
nodes $u$ and $v$, if $v$ is reachable from $u$, there is a minimum-weight path from $u$ to $v$. We define 
the minimum-weight distance $\delta(u, v)$ to be the weight of such a path, or $+\infty$ if $v$ is unreachable 
from $u$. 

We define sets of nodes $U_0, D_1, U_1, D_2, U_2, \dots$ as follows: 

• $U_0$ consists of all nodes which represent bounded variables. 

• For all $i \ge 0$, let $P^U_i = U_0 \cup D_1 \cup \dots \cup U_i$; then, $D_{i+1}$ is the set of nodes $v \notin P^U_i$ such 
that $u \leadsto v$ for some $u \in P^U_i$. 

• For all $i \ge 1$, let $P^D_i = U_0 \cup D_1 \cup \dots \cup D_i$; then, $U_i$ is the set of nodes $u \notin P^D_i$ such that 
$u \leadsto v$ for some $v \in P^D_i$. 

We extend $\sigma$ from the nodes of $U_0$, on which it is initially defined, to nodes of every set 
$D_i$ and $U_i$, inductively. Suppose that $U_0, D_1, \dots, U_i$ have already been treated; let $P^U_i$ be the 
union of these sets. Then for every $v \in D_{i+1}$ we extend $\sigma$ by letting 

$\sigma(v) = \min_{u \in P^U_i} \{\sigma(u) + \delta(u, v)\}$ 

Note that $\sigma(v)$ is finite since, by definition of $D_{i+1}$, there are nodes $u \in P^U_i$ such that $\delta(u, v)$ is 
finite, and they are already assigned. 



^Actually, the corresponding lemma in [BA11] does not consider rooted termination and therefore neglects the 
stem $H$. But if the termination test is modified to test for rooted termination (so that only reachable cycles are 
considered), the lemma stated here ensues. 
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Alternatingly, we extend a to Ui, assuming that all nodes in P^^ = C/q U Di U • • • U have 
been assigned. For u e Ui we let 

ct{u) = mayi{a{v) — 6{u,v)} 

As above, cr{v) is well-defined and finite. 

We claim that the assignments to $\sigma$ are consistent with the constraints in M. To prove this,
consider an assignment to $v \in D_{i+1}$. There are three possible types of constraints involving v
and another assigned variable:

(1) A constraint represented by an arc $v \to v'$ of weight $b$, with both $v$ and $v'$ in $D_{i+1}$. Thus, both are
reachable from $P^U_i$, and, by the definition of $\delta$, we have for all $u \in P^U_i$, $\delta(u,v) + b \ge \delta(u,v')$. In particular, choose
$u_L \in P^U_i$ such that $\sigma(u_L) + \delta(u_L,v)$ is minimum (and, hence, this is the value assigned to v);
then

    \[ \sigma(u_L) + \delta(u_L,v') \le \sigma(u_L) + \delta(u_L,v) + b \]

so our definition of $\sigma(v)$ and $\sigma(v')$ satisfies

    \[ \sigma(v') \le \sigma(v) + b \]

and the constraint is satisfied.

(2) A constraint represented by an arc $v \to v'$ with $v$ in $D_{i+1}$ and $v'$ in $P^U_i$. Here the case $i = 0$ is special. So
consider first $i > 0$. By examining the definition of the sets, the reader may verify that $P^U_i$
is closed under reverse-reachability. Hence, our assumptions imply $v \in P^U_i$, and $v \in D_{i+1}$ is
impossible. Next, let $i = 0$; so $v$ is in $D_1$ and $v'$ in $U_0$, which is the set of bounded variables. By
the definition of $D_1$, there is a $U_0$ variable which upper-bounds $v$, while $v'$ lower-bounds it, so
$v$ too is a bounded variable and cannot be in $D_1$.

(3) A constraint represented by an arc $v' \to v$ of weight $b$, with $v$ in $D_{i+1}$ and $v'$ in $P^U_i$. Clearly,

    \[ \min_{u \in P^U_i} \{ \sigma(u) + \delta(u,v) \} \le \sigma(v') + b \]

so this constraint will be satisfied.

A similar case analysis justifies the assignments in $U_i$.

Finally, there may remain nodes not in any of the above sets. These nodes are not connected
to any node already assigned. So an assignment may be chosen for them freely, only having to
satisfy relations among themselves, which is possible since M is satisfiable. □

THEOREM 4.8 (Completeness). A (MC, Z)-CTS $\mathcal{A}$ bounded-terminates only if $S(I(\mathcal{A}))$
B-terminates.

Proof. Suppose that $S(I(\mathcal{A}))$ does not B-terminate. Our goal is to show that for a particular
initial state, there is a run, starting at this state, of any length.

We use Lemma 4.6. It provides us with a cyclic multipath L and a rooted multipath H,
such that $HL^\omega$ is B-satisfiable, say by $\sigma$. For every $p > 0$, multipath $HL^p$ is B-satisfied by the
corresponding part of $\sigma$. Since it is a finite multipath in a stable system, by Lemmas 3.19 and
4.7 we can extend $\sigma$ to a complete assignment $\sigma_p$ that satisfies $HL^p$.

Next, we note that all the variables of the initial point $f_0$ of $I(\mathcal{A})$ are clearly bounded, which
means that $\sigma$ valuates them. So, all the assignments $\sigma_p$ agree on the initial state. This concludes
the proof. □
Finally we consider the complexity of the decision problem.

THEOREM 4.9. Deciding whether a (MC, Z)-CTS $\mathcal{A}$ bounded-terminates is PSPACE-complete
(and is PSPACE-hard even for stable systems that have a single flow-point).

Proof. Upper bound. The algorithm as described calls for constructing $S(I(\mathcal{A}))$, and stabilization
may, in general, increase the size of a (MC, Z)-CTS exponentially. However, as shown
in [BA10b], it is possible to implement the decision procedure for termination (or, more
precisely, for non-termination) as a non-deterministic PSPACE algorithm. The problem is then in
PSPACE thanks to Savitch's theorem. The trick is to use full elaboration (which yields a stable
system).

Given the (MC, Z)-CTS program $\mathcal{A}$, our algorithm constructs flow-points and transitions of
the elaborated system $\mathcal{E} = E(I(\mathcal{A}))$ on the fly. First, it non-deterministically walks through $\mathcal{E}$
to find a reachable flow-point f that it guesses will start the loop (the part denoted above by L).
From that point on, it maintains a summary of the multipath traversed, proceeding with the
random walk through $\mathcal{E}$ until a counter-example to bounded termination (a cyclic multipath
which is not B-terminating) has been found.

Lower bound. We reduce from the SCT problem [LJBA01], a simple case of (MC, Z)-CTS
termination. In an SCT instance, the only type of constraints that appear in the input is $x_i \triangleright x'_j$
(a source variable bounds a target variable).

Let S be an SCT instance, with a single flow-point and with n variables. Add variables $x_b$
(bottom) and $x_t$ (top), and the constraints $x_t \ge x_i \ge x_b$, for every i. To every MC add the
constraints $x_b = x'_b$ and $x_t = x'_t$. We claim that the resulting system, $\mathcal{A}$, bounded-terminates
if and only if S terminates. Indeed, it is obvious that we made all the variables of S bounded.
So if S terminates, $\mathcal{A}$ satisfies the condition for bounded termination. For the other direction,
suppose that S does not terminate. In [CLS05] it is shown that, in such a case, there is a loop
in $T_S$. That is, there is a run which reaches a certain state s and then repeats forever a certain
finite run from s to s. Such an infinite run clearly refutes bounded termination.

This reduction proves the hardness result, since the SCT problem is PSPACE-hard even for
instances with a single flow-point [BA09]. It is easy to verify that the (MC, Z)-CTS created is
also stable. □

5 The Reachability-Bound Problem 



This section presents our second theoretical result: decidability and complexity of the
reachability-bound problem, defined more precisely below.

5.1 Problem Definition



The reachability-bound problem is defined by Gulwani and Zuleger [GZ10] as the problem of
computing a worst-case symbolic bound on the number of visits to a given control location. They
discuss practical motivations for evaluating bounds for specific flow-points, rather than a global
bound on the length of runs. If the property of interest is the total length of a computation, it
is possible to obtain a bound by computing the RB bound for selected cut points in the program (for
a "big O" bound, it suffices to ensure that every cycle includes a cut point). Therefore, the RB
problem subsumes the problem of computing height bounds (as posed in the previous section).
The RB problem is more general, since different flow-points may have different bounds.
We refine the definition for our specific setting as follows.

Definition 5.1. Let $\mathcal{A}$ be a (MC, Z)-CTS, and $f \in F^{\mathcal{A}}$. Consider the instrumented program
$I(\mathcal{A})$ and let $f_0$ be its initial flow-point.

The reachability bound (RB) for f is a function $T_f : \mathbb{N} \to \mathbb{N}$ such that $T_f(N)$ is the maximal
number of occurrences of the flow-point f in a run that begins with a state $(f_0, \sigma)$ such that
$\sigma(x_{min}) = 0$ and $\sigma(x_{max}) = N$. The unrooted reachability bound is defined analogously, however
for runs beginning at f.

The reachability bound problem is to explicitly find a function $\beta$ such that $T_f(N)$ above is in
$\Theta(\beta(N))$.

The decision problem RBD (reachability-bound degree) is the set of tuples $(\mathcal{A}, f, k)$ such
that $T_f(N) \in \Omega(N^k)$. The unrooted problem URB is defined analogously.

Remarks: 

1. If the number of visits to f is unbounded, the reachability bound is undefined. We point
out below how boundedness can be verified, but otherwise we assume, throughout this
section, that a bound is known to exist.

2. The choice of (0, N) as the initial values for $(x_{min}, x_{max})$ is not restrictive; any initial state
with $x_{max} - x_{min} = N$ would give rise to the same runs, up to a shift in the values, but
the expression of the bound would be more cumbersome.

3. We shall show that the RBD problem is in PSPACE. This means that one can also search
for the tightest degree in polynomial space. We will further prove that once the tight
degree is found, one actually has $T_f$ up to a constant factor. Since $O$, $\Omega$, and $\Theta$ play an
essential role in this section, let us recall the definitions: let $F, G : \mathbb{N} \to \mathbb{N}$.

    $F \in O(G) \iff (\exists c > 0)(\exists d > 0)(\forall x)\, F(x) \le cG(x) + d$ ;
    $F \in \Omega(G) \iff (\exists c > 0)(\exists d > 0)(\forall x)\, F(x) \ge cG(x) + d$ ;
    $\Theta(G) = O(G) \cap \Omega(G)$.

We will commit the common abuse of language and write "terminate in $O(n)$ steps" instead
of "terminate in $S(n)$ steps for $S \in O(n)$."

For our analysis, it is convenient to pose the question in an inverted manner. Instead of 
asking for a bound in terms of N, we ask for a value of N that gives rise to a certain number of 
visits. 

Definition 5.2. Let M be a multipath of $I(\mathcal{A})$. We say that it is N-satisfiable if it has a
satisfying assignment with initial values (0, N) in $(x_{min}, x_{max})$.

We formulate a simple lemma:

LEMMA 5.3. Let $T_f(N)$ be the reachability bound for f. Function $\beta : \mathbb{N} \to \mathbb{N}$ satisfies $\beta \le T_f$
if and only if every multipath that starts with $f_0$ and visits f at most $\beta(N)$ times is N-satisfiable.

The words "at most" can be omitted in the lemma, since if multipath M is N-satisfiable,
then so is every prefix of M. Another important observation for our algorithms is contained in
the following lemma:
LEMMA 5.4. A multipath M of $I(\mathcal{A})$ is N-satisfiable if and only if it is satisfiable, and in the
weighted-graph representation of M, every directed path from $x_{max}$ to $x_{min}$ has weight at least
$(-N)$.

Proof. If M is not satisfiable, it is clearly not N-satisfiable. If it is satisfiable, but there is a
path from $x_{max}$ to $x_{min}$ of weight below $(-N)$, this implies $x_{min} < x_{max} - N$ and contradicts
N-satisfiability. Thus, the condition given in the lemma is necessary.

To show sufficiency, let M be a multipath satisfying both conditions. We construct an
assignment $\sigma$ for M by first defining $\sigma(x_{min}) = 0$ (in all occurrences of $x_{min}$) and
$\sigma(x_{max}) = N$.

Note that if any other variable v is related to both $x_{min}$ and $x_{max}$, it must be lower-bounded
by $x_{min}$ and upper-bounded by $x_{max}$. Let B(M) be the set of such variables of M. For each
$v \in B(M)$, define $\sigma(v) = N + w(v)$ where $w(v)$ is the lowest weight of a path from $x_{max}$ to v.
Clearly, $\sigma(v)$ will always be a number between 0 and N, and it is also easy to see that it satisfies
any constraint involving the variables of B(M). Based on the fact that M is satisfiable (and
hence includes no negative-weight cycles), it is possible to extend the assignment consistently
to the rest of the variables as done in the proof of Lemma 4.7. □
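
The criterion of Lemma 5.4 lends itself to a direct check. The following is a minimal sketch under stated assumptions, not the authors' implementation: a multipath is assumed to be given as a list of arcs (u, v, w) meaning value(v) <= value(u) + w (so a strict constraint u > v has weight -1), all occurrences of x_min and of x_max are assumed to be merged into single nodes named "xmin" and "xmax", and the function names are hypothetical. It uses the standard Bellman-Ford relaxation.

```python
import math

def has_negative_cycle(nodes, arcs):
    """Bellman-Ford from a virtual source joined to every node by a 0-weight arc."""
    dist = {v: 0 for v in nodes}
    for _ in range(len(nodes)):
        changed = False
        for u, v, w in arcs:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False
    return True  # still relaxing after |nodes| rounds: a negative cycle exists

def min_weights_from(source, nodes, arcs):
    """Minimum path weight from source to every node (math.inf if unreachable)."""
    dist = {v: math.inf for v in nodes}
    dist[source] = 0
    for _ in range(len(nodes) - 1):
        for u, v, w in arcs:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

def is_n_satisfiable(nodes, arcs, n_value, x_max="xmax", x_min="xmin"):
    """Lemma 5.4: satisfiable, and every xmax-to-xmin path weighs at least -N."""
    if has_negative_cycle(nodes, arcs):
        return False
    return min_weights_from(x_max, nodes, arcs)[x_min] >= -n_value

def bounded_assignment(nodes, arcs, n_value, x_max="xmax"):
    """sigma(v) = N + w(v) for the nodes reachable from xmax, as in the proof."""
    dist = min_weights_from(x_max, nodes, arcs)
    return {v: n_value + d for v, d in dist.items() if d != math.inf}
```

On a chain xmax >= v1 > v2 > v3 >= xmin the lightest path from xmax to xmin weighs -2, so the chain is 2-satisfiable but not 1-satisfiable, and for N = 2 the assignment gives v1 = 2, v2 = 1, v3 = 0.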

5.2 Set-up for the algorithms 

We present algorithms to decide RBD, and solve the RB problem, for the constraint transition
system $\mathcal{E} = E(I(\mathcal{A}))$, obtained by fully elaborating the instrumented version of the input system
$\mathcal{A}$. Note that a solution to the RB problem for $\mathcal{E}$ implies a solution for the original system
(with due attention to the fact that several flow-points in $\mathcal{E}$ represent a single one in $\mathcal{A}$; one has,
therefore, to sum their bounds).

LEMMA 5.5. In $\mathcal{E}$, the reachability bound for f differs from the unrooted reachability bound
for f by at most a constant factor.

Proof. Let $T_f$ be the RB for f, and let $T'_f$ be the unrooted reachability bound. As f is reachable,
every multipath M from f can be extended to a multipath M' from $f_0$ by the adjunction of a
fixed finite prefix H, leading from $f_0$ to f. It follows from Lemma 5.4 and Lemma 3.19 that if
M is N-satisfiable, then M' is (N + h)-satisfiable, where h is a constant depending on H. The
number of occurrences of f in M and M' is the same. Hence, $T_f(N + h) \ge T'_f(N)$.

Conversely, every multipath M from $f_0$ can be trimmed at the beginning so that it begins
with f. If M was N-satisfiable, it is also N-satisfiable after trimming. Hence, $T'_f(N) \ge T_f(N)$.
Using the fact that $T_f$ has polynomial growth, we conclude that $T_f \in \Theta(T'_f)$. In fact,
$T_f \in (1 + o(1))\,T'_f$. □

We have thus reduced the general problem to the unrooted one. Next, we develop a decision
procedure for the URB problem. For our next steps, we assume that we are given $\mathcal{E}$ and the
degree k. Our goal is then the following: given $\mathcal{E}$, f and k, we wish to determine if the unrooted
RB for f in $\mathcal{E}$ is in $\Omega(N^k)$. We consider $\mathcal{A}$, $\mathcal{E}$ and f fixed up to Section 5.6.

5.3 Deciding URB 

We describe a decision procedure for URB which is based on inspecting the cyclic transitions
$G : f \to f$ that appear in the closure $cl(\mathcal{E})$. Thus, we reduce the problem to one that involves a
degenerate CFG with a single flow-point. This is sound by the following easy lemma:

LEMMA 5.6. Let $T'_f(N)$ be the unrooted reachability bound for f. Let $cl(\mathcal{E})$ be the
composition-closure of $\mathcal{E}$; let $T''_f(N)$ be the maximal length of a run of $cl(\mathcal{E})$ that uses only the flow-point f
and starts with a state where $(x_{min}, x_{max}) = (0, N)$. Then $T'_f(N) \in \Theta(T''_f(N))$.

The justification for the lemma is that the closure is finite, and every transition in the closure
represents a fixed, finite sequence of transitions of $\mathcal{E}$.

Since $\mathcal{E}$ is stable, we can define the subset B of bounded variables as in Section 4. Then,
in looking for the reachability bound, we can ignore unbounded variables, as Lemma 4.7 shows
that the satisfiability of finite multipaths is only affected by the bounded variables.

Definition 5.7. $cl(\mathcal{E})|_f$ is the subset of $cl(\mathcal{E})$ that consists only of transitions $f \to f$, and where
the unbounded variables are omitted.

We are thus going to analyze $cl(\mathcal{E})|_f$. Note that $cl(\mathcal{E})|_f$ itself is a fully-elaborated transition
system, where we define f to be initial. For convenience, we denote the bounded variables of
$cl(\mathcal{E})|_f$ by $x_1, \ldots, x_n$, in the order determined during full elaboration (thus $I_f$ is
$x_1 < x_2 < \cdots < x_n$).

Note that, based on the previous section, the existence of a reachability bound for f may be
verified by simply checking $cl(\mathcal{E})|_f$ for termination. We remark that since it is a fully-elaborated
system with a single flow-point, the termination test is simpler than for a general (MC, Z)-CTS;
it is actually polynomial-time [BA11].

Definition 5.8. Let $0 \le L \le n$. A level partition of depth L and width w, denoted $\mathcal{LP}$,
consists of a disjoint partition of $\{1, \ldots, n\}$ into intervals $VI_1, \ldots, VI_w$, where $VI_1 = \{1, \ldots, i_1\}$,
$VI_2 = \{i_1 + 1, \ldots, i_2\}$, ..., $VI_w = \{i_{w-1} + 1, \ldots, n\}$, and every interval is associated with a level
from 0 to L, so that every level, except possibly level 0, has at least one associated interval, and
adjacent intervals have different levels.

In addition, every variable $x_i$ is associated with a direction $d_i \in \{-1, 1\}$.

Informally, we may view each interval $VI_j$ as a set of variables rather than indices of variables.
We use $iv(i)$ for the index of the interval to which $x_i$ belongs. We use $level(i)$ for the level
associated with the interval $iv(i)$.

For a binary relation $\triangleright$, and an integer exponent d, the notation $\triangleright^d$ has a standard meaning;
in what follows we only use the cases $\triangleright^1$ (= $\triangleright$) and $\triangleright^{-1}$ (the inverse of $\triangleright$).

Definition 5.9. Let $G \in cl(\mathcal{E})|_f$. We say that G is consistent with $\mathcal{LP}$ at level h, where
$0 < h \le L$, if for every $1 \le i \le n$ we have:

    $level(i) > h \Rightarrow (x_i \ge^{d_i} x'_i) \in G$

    $level(i) = h \Rightarrow (x_i >^{d_i} x'_i) \in G \lor (x_i \ge^{d_i} x'_i) \in G$

    $level(i) < h \Rightarrow (x_i, x'_i) \notin G$

and $\exists i : level(i) = h \land (x_i >^{d_i} x'_i) \in G$.

An $\mathcal{LP}$-consistent set of MCs is an indexed set $\mathcal{G} = \{G_1, \ldots, G_L\}$ such that each $G_h$ is
consistent with $\mathcal{LP}$ at level h.

The conditions listed in the definition may be verbally referred to by saying that G is still
at levels above h, active at level h, and disconnected at levels below h.
We can now state the essence of the decision procedure:
We can now state the essence of the decision procedure: 
THEOREM 5.10. The reachability bound for f in $cl(\mathcal{E})|_f$ is in $\Omega(N^L)$ if and only if an
$\mathcal{LP}$-consistent set of MCs exists, for some $\mathcal{LP}$ of depth L.

For intuition, observe that if such MCs exist then, for the subsystem of $cl(\mathcal{E})|_f$ that
consists only of these MCs, we have a lexicographic ranking function $\rho$ (with codomain $[0, nN]^L$)
constructed as follows: let $|x|_{-1} = x - x_{min}$ and $|x|_{+1} = x_{max} - x$. Then

    \[ \rho(x_1, \ldots, x_n) = (\rho_L(x), \rho_{L-1}(x), \ldots, \rho_1(x)) \]

where

    \[ \rho_h(x) = \sum_{level(i) = h} |x_i|_{-d_i} . \]

Therefore, for this subsystem, an $O(N^L)$ upper bound is clear. We still have to prove that it is
tight by proving an $\Omega(N^L)$ lower bound: this will justify the if part of the theorem. To establish
the only if part, we later show how the existence of a level partition and a consistent set of MCs
follow from a lower bound on the RB.

5.4 Level partition implies a lower bound 

Let $\mathcal{G}$ be an $\mathcal{LP}$-consistent set of MCs, as specified in the above theorem. Let M be a
$\mathcal{G}$-multipath (at this stage, unspecified). We associate the nodes (variables) of M to intervals and
levels according to the given level partition.

LEMMA 5.11. No $G \in \mathcal{G}$ contains an arc of the form $x_i \to x'_j$, or $x'_i \to x_j$, where $iv(j) > iv(i)$.

Proof. Suppose that $G_h \in \mathcal{G}$ contains an arc $x_i \to x'_j$ (the case $x'_i \to x_j$ is symmetric). Such an
arc implies the constraint

    $x_i \ge x'_j$ .    (4)

Due to the upward-indexing in full elaboration, and since $iv(j) > iv(i)$ implies $j > i$, we have
$x'_j > x'_i$. Hence, $(x_i > x'_i) \in G_h$. Thus, $h = level(i)$. Now, from $j > i$, we also have $x_j > x_i$, and
hence (by (4)) $(x_j > x'_j) \in G_h$. This implies $h = level(j)$, that is, $level(j) = level(i)$. However,
i and j are not in the same interval. This means that there is a j' such that $i < j' < j$ and
$level(j') \ne h$. From (4) and the upward-indexing, we obtain $x_{j'} > x'_j$, and consequently $x_{j'} > x'_{j'}$,
implying $level(j') = h$, a contradiction. □

Definition 5.12. A block in M is a maximal connected set of nodes that belong to the same
interval. The level of a block is the level of the interval.

Due to the form of graphs consistent with $\mathcal{LP}$, a block consists of nodes $x[t,i]$ where i ranges
over one of the intervals $VI_j$, and t ranges over an interval of "time coordinates" (see Figure 7).
We refer to the interval number as the height of the block, so a block is higher if it corresponds
to a higher-numbered interval. The last lemma indicates that there is no arc that connects a
block to a higher one. By the length of the block we mean the number of MCs it spans.

An in-situ arc, mentioned in the proof below, is an arc (of an MC graph) connecting $x_i$ to
$x'_i$ for some i.

LEMMA 5.13. Suppose that $cl(\mathcal{E})|_f$ has an $\mathcal{LP}$-consistent set of MCs where $\mathcal{LP}$ is a level
partition of depth L. Then $T''_f(N) \in \Omega(N^L)$.
Figure 7: A multipath with a highlighted block (the black nodes), corresponding to interval
number 2. The whole of the lowest row is also a block.

Proof. The idea of the proof is to construct a multipath M of length $\ell(N) \in \Omega(N^L)$, more
precisely $\ell(N) = \lfloor N/n \rfloor^L - 1$, that is N-satisfiable. The multipath is constructed by the following
formula. Let $b = \lfloor N/n \rfloor$ and let $1 \le t < b^L$. Denote by $\eta(t)$ the maximal number h such that
$b^{h-1}$ divides t; clearly, $1 \le \eta(t) \le L$. Let

    \[ M = G_{\eta(1)} G_{\eta(2)} \cdots G_{\eta(b^L - 1)} . \]

Observe that a block of level h is delimited on its left and right sides either by the end of the
multipath, or by an occurrence of $G_k$ for some $k > h$. By the definition of $\eta$, the length of each
level h block is $b^h - 1$.

To show that M is N-satisfiable, we use Lemma 5.4. Consider a directed path $\tau$ in M.
Suppose that $\tau$ enters a block associated with interval j, at level h. The block is a rectangular
array of nodes of dimensions $b^h \times |VI_j|$, and contains $b - 1$ occurrences of $G_h$, which is the MC
active at this level, that is, including strict in-situ arcs within the block; other MCs in the
block have a lower level and therefore have non-strict in-situ arcs. It is not hard to see that the
lightest path in the block is one that starts at the upper left corner of the block and ends in the
lower right corner, and its weight cannot be smaller than $-(b \cdot |VI_j|)$.

Once $\tau$ exits a block, it must proceed to a lower block (by Lemma 5.11), so it visits a block
at each height at most once. We conclude that the weight of $\tau$ is lower-bounded by

    \[ \sum_{i=1}^{w} -(b \cdot |VI_i|) = -b \Big( \sum_{i=1}^{w} |VI_i| \Big) = -bn \ge -(N/n)\,n = -N \]

which proves that M is N-satisfiable. □

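
The index sequence used in this construction is easy to generate. The sketch below is only illustrative: the function names are hypothetical, and it produces the levels $\eta(1), \ldots, \eta(b^L - 1)$ rather than the MCs themselves. It also measures the blocks of a given level, to check the length claimed in the proof.

```python
def eta(t, b, depth):
    """Largest h in 1..depth such that b**(h-1) divides t (assumes b >= 2)."""
    h = 1
    while h < depth and t % (b ** h) == 0:
        h += 1
    return h

def level_sequence(n_value, n_vars, depth):
    """Levels eta(1), ..., eta(b**depth - 1) with b = floor(N / n)."""
    b = n_value // n_vars
    return [eta(t, b, depth) for t in range(1, b ** depth)]

def block_lengths(levels, h):
    """Lengths of the maximal runs of positions whose level is at most h."""
    lengths, run = [], 0
    for level in levels:
        if level > h:  # an MC of a higher level delimits the block
            lengths.append(run)
            run = 0
        else:
            run += 1
    lengths.append(run)
    return lengths
```

With N = 6, n = 2 and L = 2 we get b = 3 and the level sequence 1, 1, 2, 1, 1, 2, 1, 1 of length $3^2 - 1 = 8$; the level-1 blocks have length $b - 1 = 2$ and the single level-2 block has length $b^2 - 1 = 8$.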
5.5 Lower bound implies a level partition 

We now take the converse direction, and prove that if a sufficiently long multipath exists, the 
condition in Theorem 5.10 holds. 
Recall that a thread is either a down-thread or an up-thread. A thread is in-situ if it
consists only of in-situ arcs. Below, we reason on the structure of threads. The treatment of
down-threads and up-threads is essentially the same, and to avoid cumbersome notation, we will
restrict the reasoning to down-threads.

Definition 5.14. Let $M = G_1 G_2 \cdots G_m$ be a multipath. A segment of M is a multipath
$G_b G_{b+1} \ldots G_e$ for some $1 \le b \le e \le m$. A sub-multipath of M is a multipath $M' = H_1 H_2 \ldots H_{m'}$
such that there is a subsequence $\gamma = (i_0, \ldots, i_{m'})$ of $(1, \ldots, m + 1)$, where for all j,
$H_j = G_{i_{j-1}} ; \cdots ; G_{i_j - 1}$. We write this as $M' = M_\gamma$.

LEMMA 5.15. If M is an N-satisfiable $cl(\mathcal{E})|_f$-multipath, so is any sub-multipath M'.

Proof. M' is a $cl(\mathcal{E})|_f$-multipath because $cl(\mathcal{E})|_f$ is closed under composition. It is N-satisfiable
because every path in M' is obtained from a path of M by contracting some segments of the
path due to MC composition. A strict segment is contracted to a strict arc and a non-strict
segment to a non-strict arc; in any case, the weight of the contracted path is at least the weight
of the original one. □

LEMMA 5.16. Let M be a $cl(\mathcal{E})|_f$-multipath of length at least $(n + 1)m$, for some $m > 0$.
There is a sub-multipath M' of M, of length m, that has an in-situ strict thread.

Proof. Since $cl(\mathcal{E})|_f$ is terminating, by Lemma 3.13, M has a strict complete thread $\tau$. For
every time coordinate t of M, let $v(t)$ denote the index such that $x[t, v(t)]$ is on $\tau$. Since the length of M is
at least $(n + 1)m$, there must be an i such that $v(t) = i$ at least for $m + 1$ different indices
$t = i_1, i_2, \ldots, i_{m+1}$. Hence, $M_{(i_1, \ldots, i_{m+1})}$ is the desired sub-multipath. □
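
The pigeonhole step in this proof can be sketched as follows. The representation is an assumption made for illustration: the thread is given as a list `thread_rows`, where `thread_rows[t]` is the index $v(t)$, and the function name is hypothetical.

```python
from collections import defaultdict

def in_situ_positions(thread_rows, m):
    """Find a variable index visited by the thread at m + 1 time coordinates.

    Returns (index, times) for the first index that reaches m + 1 visits,
    or None if the thread is too short for any index to repeat that often.
    """
    positions = defaultdict(list)
    for t, row in enumerate(thread_rows):
        positions[row].append(t)
        if len(positions[row]) == m + 1:
            return row, positions[row]
    return None
```

The returned time coordinates play the role of $i_1, \ldots, i_{m+1}$: cutting the multipath at these points yields a sub-multipath of length m in which the thread stays at one variable.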

LEMMA 5.17. Suppose that for some $N \ge 1$, $cl(\mathcal{E})|_f$ has an N-satisfiable multipath longer
than $(n + 1)^{n+L}(nN + 1)^{L-1}$. Then there is an $\mathcal{LP}$-consistent set of MCs where $\mathcal{LP}$ is a level
partition of depth L.

Proof. Let $M_0$ be an N-satisfiable multipath of length $(n + 1)^{n+L}(nN + 1)^{L-1}$. We will construct
a sequence of multipaths, starting with $M_0$, each multipath being a sub-multipath of the former.
This sequence will help us find the desired set of MCs. Let $k_0 = 0$.

By the last lemma, there is a sub-multipath $M_1$ of length $(n + 1)^{n+L-1}(nN + 1)^{L-1}$ that includes
an in-situ strict (down-)thread $\tau_0$, say at $x_{i_0}$. Suppose that $M_1$ has another complete thread,
disjoint from $\tau_0$. Then arguing as in the proof of Lemma 5.16, we get a sub-multipath $M_2$, of
length $(n + 1)^{n+L-2}(nN + 1)^{L-1}$, that includes two in-situ threads, $\tau_0$ and $\tau_1$. We continue this
way until we reach a multipath $M_{k_1}$, of length $(n + 1)^{n+L-k_1}(nN + 1)^{L-1}$, that does not include
any additional complete thread. All the threads $\tau_0, \ldots, \tau_{k_1 - 1}$ are complete and in-situ, and at
least one of them must be strict (otherwise, Lemma 3.13 is contradicted).

Since $M_0$ is N-satisfiable, so is $M_{k_1}$, and each of the threads above includes at most N
strict arcs. Altogether, they include at most nN strict arcs. Suppose $L > 1$. We are now looking
for a long segment of $M_{k_1}$ avoiding these strict arcs. If we drop from $M_{k_1}$ the MCs where such
a strict arc appears, we get at most $nN + 1$ segments, whose total length is at least $|M_{k_1}| - nN$.
Hence the length of one segment is at least

    \[ \frac{|M_{k_1}| - nN}{nN + 1} \ge (n + 1)^{n+L-k_1-1}(nN + 1)^{L-2} . \]

Call this segment $M'_{k_1}$. Note that all the arcs of $\tau_0, \ldots, \tau_{k_1 - 1}$ are non-strict in $M'_{k_1}$. We treat
$M'_{k_1}$ as we treated $M_0$ and generate additional sub-multipaths $M_{k_1 + 1}, \ldots, M_{k_2}$ such that $M_{k_2}$
has precisely $k_2$ complete in-situ threads, including the $k_1$ present in $M'_{k_1}$; and we argue, as
before, that at least one of the new $k_2 - k_1$ threads is strict (those inherited from $M_{k_1}$ are not,
due to the choice of $M'_{k_1}$). The length of $M_{k_2}$ is at least $(n + 1)^{n+L-k_2-1}(nN + 1)^{L-2}$. If $L > 2$,
we find $M'_{k_2}$ of length at least $(n + 1)^{n+L-k_2-2}(nN + 1)^{L-3}$, and so on, until we arrive at a
multipath $M_{k_L}$, in which no further in-situ threads can be discovered (the reader is invited to
check that the length of every $M_{k_j}$, for $j \le L$, is large enough to allow the construction to
continue).

Now, we construct a level partition $\mathcal{LP}$ of depth L. We let $level(i)$ be h if the first j, such
that $M_{k_j}$ includes an in-situ thread at $x_i$, is $L - h + 1$. We thus assign levels from L to 1.
We assign level 0 to any remaining variables. Sets $VI_1, \ldots, VI_w$ are now defined as maximal
intervals of variables that have the same level.

For $1 \le h \le L$, let $G_h = \overline{M_{k_j}}$ for $j = L - h + 1$. Then $G_h$ is consistent with $\mathcal{LP}$ at level h. □



5.6 Wrap-up 

By combining Lemmas 5.13 and 5.17, Theorem 5.10 is proved. Moreover, we obtain

COROLLARY 5.18. The reachability bound for f in $cl(\mathcal{E})|_f$ must be in $\Theta(N^L)$ for some L,
namely the largest L such that an $\mathcal{LP}$-consistent set of MCs exists, where $\mathcal{LP}$ has depth L.

Proof. Let L be as above; then by Lemma 5.13, the reachability bound is in $\Omega(N^L)$. However, if
it is not in $O(N^L)$, then for sufficiently large N the bound must exceed $(n + 1)^{n+L+1}(nN + 1)^L$;
now Lemma 5.17 implies that there is a level partition of depth $L + 1$ such that an $\mathcal{LP}$-consistent
set of MCs exists, contradicting our assumption. □



5.7 A Decision Algorithm and its Complexity 

From Theorem 5.10, we immediately get the following decision procedure for RBD. The
algorithm that follows naturally calls for constructing $cl(\mathcal{E})|_f$ and testing for the existence of a
suitable level partition and a consistent set of graphs. We define the algorithm's input to be the
instrumented abstract program, since one can in fact modify the instrumented problem in ways
that may be useful (for example, we may want to establish bounds not in terms of all initial
variables, but only a selected few; this only requires a slight modification of the instrumentation).
Algorithm 5.1. (Reachability Bound Degree) Input: Instrumented abstract program $I(\mathcal{A})$,
flow-point $f \in F^{\mathcal{A}}$ and degree d.

1. Build $cl(\mathcal{E})|_f$.

2. Search for a level partition $\mathcal{LP}$ of depth d and an $\mathcal{LP}$-consistent set of MCs.

3. Report the conclusion according to Theorem 5.10.

Since the size of $cl(\mathcal{E})|_f$ is exponential in the size of $\mathcal{A}$, this algorithm is exponential in both
time and space, even if we implement the search for $\mathcal{LP}$ very cleverly (which we don't). Instead,
we tighten the complexity bound to PSPACE in the following theorem.

THEOREM 5.19. The RBD problem is PSPACE-complete (and is PSPACE-hard even for
stable systems that have a single flow-point and are guaranteed to be bounded-terminating).

Proof. Upper bound. We give a non-deterministic polynomial-space algorithm for recognizing
RBD. By Savitch's theorem, the problem lies in PSPACE.

As described in Section 3.4, it is possible to generate transitions of $cl(\mathcal{E})|_f$ in polynomial
space, given access to $\mathcal{A}$. Our algorithm uses this procedure to sample, non-deterministically, a
set of MCs $G_1, \ldots, G_L$. The algorithm then attempts to construct a level partition that these
graphs are consistent with, which is a simple polynomial-time procedure (though, since the algorithm
is already non-deterministic, it could as well just guess the level partition and then verify
consistency). The algorithm accepts if a level partition and a consistent set of graphs have been
found.

Lower bound. To prove PSPACE-hardness, we reduce from the bounded termination problem
for stable systems with a single flow-point. Let $\mathcal{A}$ be a (MC, Z)-CTS program with n variables.
Add new variables $y_1$ through $y_{n+1}$. Add constraints to ensure that these variables will be
bounded in $I(\mathcal{A})$. From every MC G, create $n + 1$ copies, and in copy i include the relations:
$y_i > y'_i$ and $y_j \ge y'_j$ for $j > i$. These variables ensure termination in $O(N^{n+1})$ steps. However,
if $\mathcal{A}$ is in itself bounded-terminating, the system will terminate in $O(N^n)$ steps. Thus, the
bounded termination of $\mathcal{A}$ can be decided by testing the new system for a reachability bound in
$\Omega(N^{n+1})$. □

6 Significance for Concrete Programs 

(MC, Z)-CTSs may be considered as an abstract computational model and its analysis as a goal
in itself, which is interesting since such systems, despite the relative simplicity of the constraints,
may exhibit a complex behaviour. However, we would like to promote the view that such systems
are useful as an abstraction of concrete programs, to facilitate their analysis.

In this section we consider the question: What does the fact that a (MC, Z)-CTS has
polynomially bounded height tell us about the program it represents? We discuss this question
in three settings, which we first present informally; secondly, we give a formal example using
a toy programming language defined for this sake only; and finally we relate our discussion to the
implementation of two published analysis tools which use the abstract-and-conquer approach.
6.1 The three settings, informally

Flat imperative programs. We first consider imperative programs without any procedure
calls. Figures 1 and 2 are examples of flat imperative programs abstracted in the natural way.
The control-flow graph corresponds to the flow-chart of the program; transitions correspond to
program instructions or, more effectively, basic blocks. Often, the assumption is that such a
block takes a constant time to execute.

In this setting, the height of the transition system represents the time complexity of the
program. In terms of complexity classes, this allows us to identify a program as polynomial-time
in the selected input parameters. When basic blocks have associated costs which are not uniform,
the Reachability Bound analysis may allow for inferring a bound on the cost of a computation
based on the formula $\sum_f \mathrm{cost}(f) \times T_f$, where f ranges over flow-points and $T_f$ is a reachability
bound for f.
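
As a small illustration of this formula, the following sketch sums per-flow-point costs weighted by reachability bounds. The flow-point names, costs and bound functions are hypothetical values chosen for the example; they are not taken from the paper.

```python
def total_cost_bound(cost, reach_bound, n_value):
    """Evaluate the sum over flow-points f of cost(f) * T_f(N).

    cost: dict mapping a flow-point name to the cost of one visit.
    reach_bound: dict mapping a flow-point name to a function N -> T_f(N).
    """
    return sum(cost[f] * reach_bound[f](n_value) for f in cost)

# Hypothetical program: an outer loop head visited at most N + 1 times
# and an inner block of cost 3 visited at most N**2 times.
example_cost = {"loop_head": 1, "inner": 3}
example_bound = {"loop_head": lambda n: n + 1, "inner": lambda n: n ** 2}
```

For N = 10 this gives 1 * 11 + 3 * 100 = 311.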

In practice, the control-flow graph of the program may be transformed during abstraction. 
Suppose that we select a set of cut points in the program's flow-chart such that any cycle must 
traverse a cut point, and the program entry is a cut point. Any such set of cut points may be 
chosen as the set of flow-points as long as any (finite) path between two cut points is represented 
by an abstract transition. The conclusions on the concrete program's complexity remain valid. 

For programs that contain procedure calls, but not recursion, a bottom-up analysis may be
applicable. The results of analyzing a procedure p will be plugged into the summation for its
caller, using reachability bounds, as shown above ([AAGP10] also describes a bottom-up process;
however, their analysis is not based on the RB approach).

Pure-functional programs. [LJBA01] showed the simplest way in which a (first-order, eager)
pure-functional program may be abstracted. The control-flow graph is the call-graph of the
program; flow-points are function names and transitions correspond to function calls. Hence,
every call chain of the program corresponds to a particular run of the transition system.

It should be clear that in this setting, the height of the transition system represents the 
stack height of the program. This is a resource of practical importance in itself. What can 
we infer in terms of the traditional resources, space and time? The pure functionality suggests 
that there is no iteration but recursion, so it may be possible to bound the execution time of 
a function body, or the space it consumes, outside any calls it performs; often, this bound will 
be a constant. If functions cannot allocate "heap space" at all, the stack height corresponds 
to space usage. If the functions can allocate space outside the stack, exponential space may be 
consumed for a polynomial stack height. Because the call tree is a tree of bounded degree, we 
obtain an exponential time bound (that is, a constant to a polynomial power). 

In terms of complexity classes, we may conclude that the program is polynomial-space or
only the weaker result that it belongs to the class EXPTIME.

Also in this setting, we note that the abstraction may create the CFG in different ways, 
which are sometimes useful. For example, [MV06b] chose call sites to be flow-points rather than 
function names. 

6.2 Analysing a simple programming language 

We demonstrate the ideas more formally by defining three variants of a simple (but Turing-complete)
functional programming language SFPL and a simple-minded, conservative abstraction A
mapping SFPL programs to (MC, Z)-CTSs. Since the language has functional style, imperative
programs are represented by tail recursion.

The syntax of SFPL is defined in Table 1, and further explained below. Semantically, SFPL
programs operate on strings over a finite alphabet Σ = {0, 1, ...}. The expression a:x, where
a ∈ Σ, evaluates to a followed by the value of x.

A program is a collection of definitions which leaves no undefined identifiers. A function is 
defined by a set of definitional patterns. To avoid ambiguity, a first-match disambiguation rule 
can be used. If there is no match, the program halts. A wildcard "?" can be introduced in 
patterns as syntactic sugar. For simplicity, all functions have the same arity n. One function is
designated as the entry point.

Example 6.1. Here is a short SFPL program that tests two strings for equality, where Σ = {0, 1}.
As a small complication, it occasionally swaps its arguments. We use the first-match rule for
pattern matching.

    f(ε, ε) = 1
    f(0:x1, 0:x2) = f(x1, x2)
    f(1:x1, 1:x2) = f(x2, x1)
    f(x1, x2) = ε
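
For readers who prefer an imperative rendering, the program of Example 6.1 can be sketched in C as a loop, since all its calls are tail calls. The encoding is an assumption of this sketch: strings over {0, 1} are C strings of the characters '0' and '1', the SFPL value 1 is returned as the integer 1 and the empty string as 0, and the name `sfpl_equal` and the counter `calls` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 for the SFPL value "1" and 0 for the empty string (eps).
 * *calls receives the number of tail calls performed. */
int sfpl_equal(const char *x1, const char *x2, size_t *calls) {
    assert(x1 != NULL && x2 != NULL && calls != NULL);
    *calls = 0;
    for (;;) {
        if (*x1 == '\0' && *x2 == '\0')
            return 1;                           /* f(eps, eps) = 1 */
        if (*x1 == '0' && *x2 == '0') {         /* f(0:x1, 0:x2) = f(x1, x2) */
            x1++;
            x2++;
        } else if (*x1 == '1' && *x2 == '1') {  /* f(1:x1, 1:x2) = f(x2, x1) */
            const char *tail1 = x1 + 1;
            x1 = x2 + 1;
            x2 = tail1;
        } else {
            return 0;                           /* f(x1, x2) = eps (catch-all) */
        }
        (*calls)++;
    }
}
```

Every tail call removes one symbol from each argument, so the number of calls never exceeds the length of the shorter string; this is the kind of bound that the abstraction is meant to expose.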

The specification of function bodies and their return values differs in the three language 
variants: 

SFPL1 Allows only the simple and the tail-recursive expressions as function bodies. Hence,
it represents imperative programs. The return value of a function is a string in Σ*.

SFPL2 Allows, in addition, the conditional expression (see Table 1). The condition g1(...)
is evaluated first; if it is a non-empty string, the value of the conditional is obtained by
evaluating g2(...), and otherwise, g3(...).

SFPL3 Also includes the nested ("let") expression.

Definition 6.1. Abstraction A maps an SFPL program to a (MC, Z)-CTS as follows: the
flow-point set F is the set of defined functions. There is an abstract transition G : f → g for
every call expression g(β_1, ..., β_n) in a definition f(π_1, ..., π_n) = ...; a relation among x_i and
x'_j is included in G, dependent on the patterns π_i and β_j as specified in the following table (the
cases missing in the table contribute no constraint).

    π_i      β_j          relation
    ε        ε            x_i ≥ x'_j
    x_i      ε            x_i ≥ x'_j
    x_i      x_i          x_i ≥ x'_j
    a:x_i    ε | x_i      x_i > x'_j
    a:x_i    a':x_i       x_i ≥ x'_j

    Prg   ∋ p    ::= D_1 ... D_n
    Dfn   ∋ D_i  ::= f(π_1, ..., π_n) = e
    Expr  ∋ e    ::= α_i                                                 (simple expression)
                   | g(α_1, ..., α_n)                                    (tail-recursive expression)
                   | g_1(α_1, ..., α_n) ? g_2(α'_1, ..., α'_n) : g_3(α''_1, ..., α''_n)
                                                                         (conditional expression)
                   | let y = g_1(α_1, ..., α_n) in g_2(β_1, ..., β_n)    (nested expression)
    Pat   ∋ π_i  ::= ε | x_i | a:x_i                                     (parameter pattern)
    APat  ∋ β_i  ::= ε | a | x_j | a:x_j | b:a:x_j                       (actual parameter)
    APat' ∋ α_i  ::= β_i | y                                             (extended actual parameter)

    a, b ∈ Σ;  i, j ∈ {1, ..., n}

Table 1: Syntax of SFPL
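
To make Definition 6.1 concrete, the following C sketch writes down the two abstract transitions that we would expect for the recursive calls of Example 6.1 and checks them against concrete argument lengths. It rests on our reading of the definition: a pattern a:x_i paired with the actual parameter x_i yields a strict decrease. The type `Constraint`, the constant names and the function `mc_holds` are hypothetical, and abstract variable 0 (resp. 1) stands for the length of the first (resp. second) argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One constraint: source[src] > target[dst] (strict) or source[src] >= target[dst]. */
typedef struct { size_t src, dst; bool strict; } Constraint;

/* f(0:x1, 0:x2) = f(x1, x2): each argument shrinks in place. */
static const Constraint CALL_NO_SWAP[] = { {0, 0, true}, {1, 1, true} };
/* f(1:x1, 1:x2) = f(x2, x1): each argument shrinks and changes position. */
static const Constraint CALL_SWAP[]    = { {0, 1, true}, {1, 0, true} };

/* Checks whether a concrete pair of states satisfies all constraints of an MC. */
bool mc_holds(const Constraint *mc, size_t count,
              const long *source, const long *target) {
    assert(mc != NULL && source != NULL && target != NULL);
    for (size_t k = 0; k < count; k++) {
        long s = source[mc[k].src];
        long t = target[mc[k].dst];
        if (mc[k].strict ? !(s > t) : !(s >= t))
            return false;
    }
    return true;
}
```

For instance, the call f(10, 111), which matches the third definition and continues as f(11, 0), moves from lengths (2, 3) to lengths (2, 1); this pair satisfies the second transition but not the first.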



We can now state our observations in this formal setting. 

We assume a typical RAM implementation of SFPL, using a stack for function calls, and 
a heap memory to keep the strings, which are implemented as linked lists, so that removing 
or adding an element at the front takes constant time and space. We also assume immediate 
garbage collection so that garbage does not accumulate (this is easy for such a language, e.g., 
by reference counting). 

CLAIM 6.2. If A(P) satisfies bounded termination, and P is an SFPL_i program, then, for all i,
the stack height is polynomially bounded in the size of the input strings. For i = 1, the program
runs in polynomial time; for i = 2, its space usage is polynomial; and for i = 3, its running time
is bounded by 2^poly(n).

A formal proof of this claim is skipped as it is uninteresting and tedious (demanding a 
formalization of semantics and complexity, currently left informal). The time bound in the case 
of SFPL1 is straightforward and that of SFPL3 follows almost as easily since the height of
the recursion tree is polynomial. As to the space bound for SFPL2, note that a branch in the
recursion tree only occurs in this language when a conditional is evaluated, and that heap space
allocated by the evaluation of the condition (g1) can be discarded once it is determined whether
the return value is ε or not. Thus, for the purpose of bounding the space, it suffices to
consider the stack height.

Note that our language is, in fact, a Turing-complete one. It is possible to extend Claim 6.2 to 
a proposition of class capture: every decision problem in PTIME (resp., PSPACE, EXPTIME) 
may be represented by an SFPL1 (resp., SFPL2, SFPL3) program. We find that this result is
of little consequence to the main goals of our work, and have decided to omit the proof. 

6.3 A discussion of two analyzers for real-world languages

We compare our informal statements at the beginning of this section to the way abstraction
is used in the WTC project by Alias et al. [ADFG10a] and the COSTA project by Albert et
al. [AAG+09, AAG+12]. As described in Section 1, both works use a constraint language richer
than monotonicity constraints, but this issue is independent of the current discussion (it may
affect precision of the abstraction; see the next section).

In [ADFG10a], C language programs are abstracted to affine constraint transition systems.
They have implemented two forms of abstraction. One represents a basic block as a transition;
the other places a flow-point only at a loop header and expands the loop body so that every
path through the loop is abstracted to one abstract transition. This means that exponentially
more transitions may be generated, but the abstraction will be more precise. In both cases, our
informal description for "flat imperative programs" applies.

In [AAG+09, AAG+12], Java Bytecode programs are abstracted to transition systems which
express a sequential transition (from a block in the flow-chart to the next) and a procedure call in
essentially the same way. Thus a sequential computation is treated as tail recursion, much like
in our toy language. The analysis described in [AAGP10] distinguishes the case of tail recursion
from the case where a recursion tree is involved and an exponential bound may result. This 
is again similar to the framework we have described. Their abstract programs are annotated 
with cost expressions, used in computing a closed formula for a cost bound. As stated earlier, 
in our framework this may require the computation of reachability bounds and a (symbolic) 
summation, and possibly also another static analysis to bound the cost expressions in terms of 
input parameters. 

There are other tools that translate real-world languages to some kind of constraint transition
systems; for example, [SMP10] analyze Java Bytecode and [MV06b] analyze the ACL2 programming
language, both for the purpose of termination analysis. Since the correspondence of the
abstract program to the concrete one is still essentially as in our discussion, we conclude that 
the generated abstract programs could be used, perhaps with some adaptation, for cost analysis 
as well. 

6.4 Reflections on Effective Abstraction 

Both of the tools we cited in the last section use a more expressive abstraction: an
affine-constraint CTS (also known as a CTS with polyhedral constraints). This constraint language
is strictly more expressive, as monotonicity constraints form a simple special case of affine
constraints. So there is reason to fear that by abstracting a program to a (MC, Z)-CTS we
might lose crucial information. We would like to argue that this consideration should not 
discourage researchers from employing this abstraction. 

One reason for our optimism is the existing empirical evidence for the effectiveness of the
size-change technique in termination analysis [LS97, CT99, TG05, MV06b, BAG08, SMP10, KH09,
GGBA+11]. As shown in our theoretical sections, the complexity analysis is a refinement of
termination analysis and reuses its methods. Nonetheless, we argue that for bounded termination
analysis, it is necessary to transfer more information to the (MC, Z)-CTS than one does for
termination, in particular if one wants to analyze it as a stand-alone abstract program. The
main reason is the necessity for bounding variables. Consider Program 2 in Figure 2 on Page 6. If
the initial assignment is changed from i=N to i = 2*N, and the abstract variables still correspond
to the program variables in a one-to-one fashion, we will lose the bound on i in terms of N, since
it is not a monotonicity constraint. Note that this relation is not necessary for the termination
proof, but is crucial for deducing bounded termination.

We think that this problem may be mitigated by the use of an auxiliary bound analysis, one
which attempts to bound expressions in the program in terms of the designated input variables.
Such an analysis can be performed by, for example, polyhedral analysis [CH78] or one of its
many variants. When an expression exp is found to be bounded by a bound B_exp in terms of
the input, an abstract variable representing B_exp may be added to the abstraction. In order
to avoid combinatorial explosion, one may decide to add such variables only when necessary
for changing an unbounded variable in the (MC, Z)-CTS into a bounded one; one may also
opt to keep only a representative of the maximum among such expressions, in the same way
we used x_max in Section 4. Note that if we have an analysis that (unlike polyhedral analysis)
may ascertain a non-polynomial bound on exp, we may end up with complexity bounds that are
polynomial functions of that bound, hence possibly non-polynomial as a function of the input
parameters.

We also invite the reader to note that (MC, Z)-CTSs can capture rather complex behaviours.
The examples in the next section illustrate a few. This should be at least a reason to consider
the model interesting.

7 Additional Examples 

To illustrate the variety of loop structures that can be represented and analysed, we have selected
a few examples, shown in this section as C program fragments; see also the examples on pages 4, 6
and 19. In all these examples, it is easy to verify that the associated constraint systems
are indeed bounded terminating.

Example 7.1. This is a quadratic-time example (similar to Figure 2, but counting up rather than
down), from [GMC09], where it is analysed by means of counter instrumentation and bounding.

    void SimpleMultipleDep(int n, int m) {
        int x = 0, y = 0;
        while (x < n)
            if (y < m) y++;
            else { y = 0; x++; }
    }

    (1) x' = 0' ∧ y' = 0'
    (2) x < n ∧ Same(m, n, y, 0)
    (3) y < m ∧ y < y' ∧ Same(m, n, x, 0)
    (4) y' = 0' ∧ x < x' ∧ Same(m, n, 0)
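
The quadratic behaviour of Example 7.1 can be observed directly by instrumenting the loop with a counter. The sketch below is an illustration only: the function names are hypothetical, `long` is used instead of `int` to keep the arithmetic comfortable for small inputs, and the closed form n(max(m, 0) + 1) is our own calculation for this specific loop rather than an output of the analysis.

```c
#include <assert.h>

/* Closed-form iteration count for the loop of SimpleMultipleDep:
 * each of the n increments of x is preceded by max(m, 0) increments of y. */
long simple_multiple_dep_bound(long n, long m) {
    if (n <= 0) return 0;
    return n * ((m > 0 ? m : 0) + 1);
}

/* Runs the loop and returns the number of iterations of the while loop. */
long simple_multiple_dep_iterations(long n, long m) {
    long x = 0, y = 0, iterations = 0;
    while (x < n) {
        if (y < m) y++;
        else { y = 0; x++; }
        iterations++;
    }
    assert(iterations <= simple_multiple_dep_bound(n, m));
    return iterations;
}
```

For n = 3 and m = 4, each value of x costs four increments of y followed by one reset, so the loop runs 15 times.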

Example 7.2. The next example is from [GMC09]. They explain that their algorithm does not
handle it because of the lack of path-sensitive information. Alias et al. report in [ADFG10b]
that their tool solved this instance.

    void pathSensitive2(int n, int b, int x) {
        int t;
        if (b>=1) t=1; else t = -1;
        while (x<=n) {
            if (b>=1)
                x=x+t;
            else
                x=x-t;
        }
    }

In its MC representation, we represent the effect of addition and subtraction disjunctively: for 
example, we use the knowledge that x = x+t is a command that increases x if t is positive,
decreases x if t is negative, etc. Thus we have three MCs for each command of this form. In 
this particular program, two of those represent transitions that will never be taken in an actual 
run, but we do not assume our "front end" to do such an analysis. 

    (1) b > 0 ∧ t' > 0' ∧ Same(b, n, 0)
    (2) b ≤ 0 ∧ t' < 0' ∧ Same(b, n, 0)
    (3) x ≤ n ∧ Same(x, b, n, t, 0)
    (4) b > 0 ∧ t > 0 ∧ x' > x ∧ Same(b, n, t, 0)
    (5) b > 0 ∧ t < 0 ∧ x' < x ∧ Same(b, n, t, 0)
    (6) b > 0 ∧ t = 0 ∧ x' = x ∧ Same(b, n, t, 0)
    (7) b ≤ 0 ∧ t > 0 ∧ x' < x ∧ Same(b, n, t, 0)
    (8) b ≤ 0 ∧ t < 0 ∧ x' > x ∧ Same(b, n, t, 0)
    (9) b ≤ 0 ∧ t = 0 ∧ x' = x ∧ Same(b, n, t, 0)
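
As a sanity check on the path-sensitive reasoning, the concrete program can be instrumented in C. This is a sketch under assumptions: the function name is hypothetical, `long` replaces `int`, and the assertion records our observation that in an actual run x increases in every iteration, whichever branch is taken.

```c
#include <assert.h>

/* Runs pathSensitive2 and returns the number of loop iterations. */
long path_sensitive2_iterations(long n, long b, long x) {
    long t = (b >= 1) ? 1 : -1;
    long iterations = 0;
    while (x <= n) {
        long before = x;
        if (b >= 1) x = x + t;   /* b >= 1 implies t = 1  */
        else        x = x - t;   /* b <  1 implies t = -1 */
        assert(x > before);      /* x grows on both feasible paths */
        iterations++;
    }
    return iterations;
}
```

Since x grows by one per iteration on both feasible paths, the loop runs n - x + 1 times when x ≤ n initially, and not at all otherwise.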

Example 7.3. The next program does not have a lexicographic-linear global ranking function,
an obstacle for tools that, explicitly or implicitly, require functions of this kind (this class
includes [ADFG10a], by their own description, and also COSTA, though the fact is implicit;
see Section 2. The class also includes the algorithm of [GMC09], according to a discussion
in [ADFG10a]). We omit the transition system this time, which the reader should be able to
create with ease (for assignments like y = y+x it suffices, in this case, to consider y as being
unconstrained, although a disjunctive representation of the effect, as in the previous example,
could be harmlessly included).

    void min(int x, int y) {
        int z;
        while (y > 0 && x > 0) {
            if (x > y) z = y;
            else z = x;
            if (*) { y = y+x; x = z-1; z = y+z; }
            else   { x = y+x; y = z-1; z = x+z; }
        }
    }
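
To see why the loop of Example 7.3 terminates within a linear number of steps, one can simulate it with the non-deterministic choice (*) resolved by an explicit argument. The sketch below makes several assumptions of its own: the function name is hypothetical, the bits of `choices` (lowest bit first, zero once exhausted) decide the branch, inputs are small enough that the sums do not overflow a `long`, and the bound min(x, y) is our own observation that the smaller of the two variables drops by one in every iteration.

```c
#include <assert.h>

/* Runs the loop of min() and returns the number of iterations.
 * Bit k of `choices` selects the first branch in iteration k. */
long min_loop_iterations(long x, long y, unsigned choices) {
    long iterations = 0;
    long bound = (x > 0 && y > 0) ? (x < y ? x : y) : 0;
    while (y > 0 && x > 0) {
        long z;
        if (x > y) z = y;
        else z = x;
        if (choices & 1u) { y = y + x; x = z - 1; z = y + z; }
        else              { x = y + x; y = z - 1; z = x + z; }
        (void)z;                     /* the last assignment to z is dead */
        choices >>= 1;
        iterations++;
        assert(iterations <= bound); /* min(x, y) decreases in every step */
    }
    return iterations;
}
```

Whichever branch is taken, one variable becomes min(x, y) - 1 and the other becomes x + y, so the decreasing quantity keeps switching between the two variables.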

Another instance where lexicographic linear global ranking functions do not suffice is given
in Figure 6 (Page 19).

Example 7.4. The following example from [GZ10] shows the weakness of a straightforward
abstraction to monotonicity constraints.

    i = 0;
    while (i < n) {
        j = i + 1;
        while (j < n) {
            if (A[j]) { j--; n--; }
            j++;
        }
        i++;
    }

The problem is that abstracting the effect of the if -block on j to j' < j does not allow a 
later analysis to figure out that j++ "undoes" this decrement. There are, of course, multiple 
ways to handle this issue. For example, one could use a more expressive abstraction — say, 
(^j^, Z)-CTS — and use it for computing a composition in the closure algorithm, widening to 
monotonicity constraints only at the level of cycles. This still allows the use of (MC, Z)-CTS
algorithms for the bound analysis. 

8 Conclusion 

The Monotonicity Constraint abstraction came into being specifically for the purpose of
termination analysis [CT99, LS97, Sag91]. It is natural to wish to extend termination proofs into
complexity bounds. This work does so for the MC framework. For abstract programs, the
complexity problem is to bound the length of transition sequences. Pleasantly, we find that the
problem is decidable, and its computational complexity is the same as that of termination. An
interesting conclusion is that a bound exists if and only if a polynomial one does (a different kind
of statement than stating that a certain analysis tool only finds polynomial bounds!). We have
investigated the problem of obtaining polynomials of minimum degree, and found that to be
also computable and in PSPACE.

Since we are dealing with abstract programs, the question of relating these bounds to the
complexity of the concrete program arises. We illustrate how the polynomial bound may mean
polynomial time, polynomial space or a polynomial exponent. In fact, the classes PTIME, PSPACE
and EXPTIME may all be captured by very simple abstractions of programs to constraint systems.

We have not yet been able to perform an empirical evaluation, but at least theoretically, 
our results sustain the claim that, just as they proved quite useful for termination, MCs can 
contribute to complexity analysis. 

In this research, we chose to relax the expression of the bound to a big-O one so we can
show that the precise degree is decidable. We propose as an open problem the question whether
bounds that have precise explicit constants can also be computed (in polynomial space?).